Camera-Based Navigation System for Blind and Visually Impaired People
Sohag Engineering Journal
Volume 3, Issue 1, March 2023, Pages 1-13
Document Type: Original research articles
DOI: 10.21608/sej.2022.155927.1018
Authors
Islam Mohamed Kamal
Dept. of Electrical Engineering, Faculty of Engineering, Sohag University, Sohag 82524, Egypt
Abstract
One of the essential aspects of our life as humans is vision, the ability to see and describe the world with our eyes. Blind and even visually impaired people therefore face plenty of hardships in their daily activities; they may find it difficult to recognize objects or people in front of them. This paper presents a real-time system to aid these people, using a Raspberry Pi to process information provided by a camera and an ultrasonic sensor, with a simple feedback mechanism that informs the user through a headphone. Using this system, a blind person could walk without the white cane and, most importantly, it reduces their dependence on other people, thus increasing their quality of life. The proposed system works in different scenarios, in both indoor and outdoor environments, and has multiple modes of operation to help the blind in different situations with minimal hardware, yielding an affordable device with satisfactory offline real-time performance. The system mainly utilizes deep learning with computer vision to efficiently address common difficulties such as object recognition, face recognition, and reading text with real-time feedback. It sends an audio signal informing the user about objects with their count and relative position, familiar faces, or text recognized in the captured frame.
Keywords
Visually impaired; Navigation; Travel aid; Assistive system; Computer vision
Full Text
Visual impairment can discourage people from performing their usual activities; it hinders work, study, travel, and overall health, and navigation becomes a real challenge. The World Health Organization (WHO) estimates that about 2.2 billion people of all ages worldwide have a visual impairment or blindness, and that at least half of these cases could have been prevented or have yet to be addressed [1]. We are concerned with cases for which no cure or corrective glasses are available. According to the Vision Loss Expert Group estimates [2] of the global number of people with visual impairment or blindness over time, the number of such cases is increasing every year, as shown in Table 1.
Table 1. Estimation by the Vision Loss Expert Group of the worldwide number of people who are blind or visually impaired over time.
Several attempts have been made to help blind and visually impaired (BVI) people navigate independently, safely, and efficiently. Conventional navigation aids such as white canes and guide dogs are cheap and cost-effective; however, they require the user's full attention, so they are mainly useful for near-ground obstacle avoidance, especially in uncrowded, safer outdoor environments. With the growth of technology, smart electronic travel aids (ETAs) have been developed that further help reduce accidents and facilitate movement, improving the traveling experience in unfamiliar places in general [3,4]. ETAs fall into subcategories such as robotic navigation aids (RNAs), smartphone-based systems, and embedded systems in wearable attachments.
RNAs are mostly hardware-based systems, as shown in [5,6]. They mainly use blind-friendly interfaces in the form of smart canes, which are useful because they can also be used passively as a normal white cane. These canes are equipped with various sensors such as 3D cameras, ultrasonic or LiDAR sensors, fire or water sensors, global positioning system (GPS) modules, etc. Their limitations stem mainly from their compact size, which drives up development cost, because several sensing elements must be packed into a small form factor to convey sufficient obstacle information for pathfinding and navigation. A useful hardware-based product [7] is available but, unfortunately, it is expensive at $599 per unit. Most existing systems are either hard to adapt, costly, difficult to carry, or complicated to use, which reduces the usefulness of the majority of hardware solutions.
Another type of ETA relies on the smartphone to build a reliable, non-bulky device with less hardware and hence a relatively low cost, making the internet of things (IoT) and cloud computing the dominant fields of study for such systems. The systems in [8-10] use IoT via a Bluetooth assistance application on a smartphone. Since the smartphone is the main computing device, these systems are limited to the smartphone's sensors; any additional sensor must communicate with the smartphone through some form of Radio Frequency Identification (RFID), a beacon, or an external server. This makes firmware and software easier to update over the internet, but it also reveals the major disadvantage of these systems: their total dependence on beacon and internet signals for communication between the external sensors, the cloud, and the smartphone. Hence, they may be unsuitable for real-time performance.
There are also software-only solutions [11-14] requiring no hardware beyond a smartphone, such as KNFB Reader (OneStep Reader), TapTapSee, Cash Reader, and Seeing AI. These apps mainly use the smartphone's camera, applying computer vision techniques to process the captured image or frame and then giving feedback describing it to the user. Leaving aside the fact that some of these applications are pay-to-use, their major disadvantage is that visual interaction is still needed to select the app and its features from a menu, as in Seeing AI. This makes them suitable solutions for the visually impaired; however, they are far less useful for blind people.
A better approach integrates smart systems into one or more wearable attachments with lightweight sensors to help the blind with common activities. According to [15,16], such systems provide real-time performance and immediate feedback since they are worn by the user, and they can contain multiple sensors in different attachments (belts, gloves, glasses, jackets, shoes, etc.) to acquire different kinds of information about obstacles; hence, they are suitable for aiding in various situations. The proposed project is a pair of wearable glasses built around a deep-learning computer-vision system. The challenge in developing such an assistive system is that increasing the precision of a convolutional neural network (CNN) requires a more complex architecture, which in turn reduces speed: the architecture determines the number of parameters that must be computed before classification. Different object detection architectures were tested; however, for real-time use, the smaller the architecture, the better. Thus, a single shot detector (SSD) object detection architecture [17] is used, giving an optimal balance between speed and accuracy. The proposed system aids BVI people with three functions: object recognition with localization relative to the user and a distance threshold, face recognition for family and friends, and reading text using an optical character recognizer (OCR). The rest of this paper is organized as follows: Section 2 gives an overview of the proposed system, Section 3 presents the methodology and results, and Section 4 discusses the conclusion and ideas for future work.
In its simplest form, the prototype is an assistive system for visually impaired and blind people built into wearable attachments. A pair of 3D-designed glasses was chosen as the wearable attachment, although multiple intelligent attachments could be deployed for different purposes, as stated before. These systems should be interfaced and integrated with fast sensors so that they react in real time, and they should help protect the user from dangerous indoor or outdoor situations, allowing a degree of path planning for BVI people in a cost-effective way. The proposed system, shown in Fig. 1, consists of two main subsystems with three different modes of operation to provide sufficient aid for BVI people. Although the system is intended mostly for indoor environments, it can still be used outdoors. We first discuss the functions of each subsystem and then go through how the three modes work.
Fig. 1. Proposed system block diagram.
We use a single Camera Module Rev 1.3 with a ribbon cable designed specifically for the Raspberry Pi. This 5 MP module supports capturing video at 480p at 60/90 fps, 720p at 60 fps, and 1080p at 30 fps; however, the effective frame rate drops because of the processing done on each frame. This sensor provides the information needed for object detection. We also use a single HC-SR04 ultrasonic sensor, which returns the echo signal needed to calculate distance from the triggered ultrasonic wave. We keep saying a single unit because several units in multiple wearable attachments could provide multiple signals for different purposes; please refer to the ideas for future work in Section 4.
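As an illustration only, the following minimal sketch shows how such a camera module is typically configured from Python with the legacy picamera library so that each frame arrives as a NumPy array ready for further processing. The resolution, frame rate, and variable names are assumptions for the example and are not taken from the paper.

```python
# Minimal capture sketch (assumed setup, not the paper's actual code):
# configure the Raspberry Pi camera module and stream BGR frames as NumPy arrays.
import time
from picamera import PiCamera
from picamera.array import PiRGBArray

camera = PiCamera()
camera.resolution = (1280, 720)   # 720p; the module also supports 480p and 1080p
camera.framerate = 30             # effective rate drops once per-frame processing is added
raw = PiRGBArray(camera, size=camera.resolution)
time.sleep(0.1)                   # short warm-up for the sensor

for capture in camera.capture_continuous(raw, format="bgr", use_video_port=True):
    frame = capture.array         # H x W x 3 BGR array, usable with OpenCV / TFLite
    # ... hand the frame to the active mode (object / face / text) here ...
    raw.truncate(0)               # reset the buffer before grabbing the next frame
```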
The proposed system has three main modes of operation, as represented in Fig. 2. The BVI user can toggle between them using a single pushbutton. The distance measuring subsystem, based on the ultrasonic sensor, allows the system to issue audible stop warnings once an obstacle comes closer than a certain threshold. This subsystem is mainly used in mode 0; however, it could usefully be carried over to mode 1, too.
Fig. 2. Modes of operation.
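The single-pushbutton mode switching can be realized with one GPIO interrupt. The sketch below is a hypothetical illustration, assuming the RPi.GPIO library and an arbitrary BCM pin number (the paper does not specify the wiring); each press cycles through modes 0, 1, and 2.

```python
# Hypothetical mode-toggle sketch; pin number and debounce time are assumptions.
import time
import RPi.GPIO as GPIO

MODE_BUTTON_PIN = 17   # assumed BCM pin; actual wiring is not given in the paper
NUM_MODES = 3          # mode 0: objects, mode 1: faces, mode 2: text

GPIO.setmode(GPIO.BCM)
GPIO.setup(MODE_BUTTON_PIN, GPIO.IN, pull_up_down=GPIO.PUD_UP)

mode = 0

def on_press(channel):
    """Cycle 0 -> 1 -> 2 -> 0 on each button press."""
    global mode
    mode = (mode + 1) % NUM_MODES
    print("Switched to mode", mode)

# Falling edge because of the pull-up; bouncetime debounces the mechanical switch.
GPIO.add_event_detect(MODE_BUTTON_PIN, GPIO.FALLING,
                      callback=on_press, bouncetime=300)

try:
    while True:
        time.sleep(1)   # the main loop would run the currently active mode here
finally:
    GPIO.cleanup()
```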
The entire process is explained in this section. The proposed methodology, shown as block diagrams in Fig. 4, is inspired by [19] with additional functionalities.
Consider first the object detection mode, mode 0. Mode 0 combines distance calculation, compared against a threshold defined in the script, with a pair of detection models inspired by ensemble learning; unlike classical ensemble learning, however, the two models have different outputs and predict different categories.
Distance estimation is done by the ultrasonic sensor. Measurements are taken only every few frames to lower power consumption, but at a higher rate whenever no object is present in the captured frame. Several factors can affect distance estimation because the sensor relies on the speed of sound, which varies with temperature, humidity, and other weather conditions. This variation can affect results, especially in extreme environments; the proposed project does not account for it and instead treats the average sound velocity as a constant (about 343 m/s). The distance is then calculated from the measured round-trip time t of the triggered wave as d = v·t/2. If the distance falls below a threshold value, i.e., 30 centimeters, a stop alert is fired. Obstacle avoidance is therefore the sole aim of this distance-measuring block.
In this mode, deep learning is used for object detection. Consider first the pre-trained SSD model, model A. This model is quantized for faster processing and, since it is pre-trained on a relatively large database, it has a decent generalization error, meaning it performs well in different environmental conditions. Reducing generalization error is the real goal of any deep learning model: during training, an algorithm called an optimizer adjusts the weights and biases of the network to minimize the training error, but the generalization error cannot be controlled directly and has to be measured on a test set. Beyond some point, further reducing the training error increases the generalization error; this is the well-known problem of overfitting, and a small dataset makes a given network more prone to it. Training an architecture from scratch would require a huge dataset, but since we do not have many images per class, we can reuse knowledge from a pre-trained model, a type of transfer learning called fine-tuning. By freezing the earlier network parameters, which encode low-level features, and learning only the higher-level features, the optimizer has far fewer parameters to improve, and our small, manually gathered image set becomes sufficient for decent generalization. The drawback is that the original images of the pre-trained classes would have to be included alongside our custom set during training; otherwise fine-tuning would significantly degrade generalization for those pre-trained classes. The workaround adopted in the proposed system is to integrate two different object detection models: model A is a quantized pre-trained model identifying 80 common object classes [20], shown in Fig. 5, with good generalization across scenarios, and model B is a custom object detection model covering objects absent from the COCO labels, including some hazards, shown in Fig. 6.
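To make the distance block concrete, the following sketch measures the HC-SR04 echo pulse and applies the d = v·t/2 relation with a constant speed of sound and the 30 cm stop threshold mentioned above. The pin numbers and the RPi.GPIO polling loop are assumptions for illustration, not the paper's actual implementation.

```python
# Illustrative HC-SR04 distance sketch; TRIG/ECHO pins are assumed, not from the paper.
import time
import RPi.GPIO as GPIO

TRIG_PIN = 23
ECHO_PIN = 24
SOUND_SPEED = 343.0        # m/s, average speed of sound treated as constant
STOP_THRESHOLD_CM = 30.0   # stop-alert threshold used in the paper

GPIO.setmode(GPIO.BCM)
GPIO.setup(TRIG_PIN, GPIO.OUT)
GPIO.setup(ECHO_PIN, GPIO.IN)

def measure_distance_cm():
    # A 10 us trigger pulse makes the sensor emit an ultrasonic burst.
    GPIO.output(TRIG_PIN, True)
    time.sleep(10e-6)
    GPIO.output(TRIG_PIN, False)

    # The echo pin stays high for the round-trip time of the burst.
    start = end = time.time()
    while GPIO.input(ECHO_PIN) == 0:
        start = time.time()
    while GPIO.input(ECHO_PIN) == 1:
        end = time.time()

    round_trip = end - start
    # d = v * t / 2 gives the one-way distance; convert metres to centimetres.
    return (SOUND_SPEED * round_trip / 2.0) * 100.0

try:
    while True:
        d = measure_distance_cm()
        if d < STOP_THRESHOLD_CM:
            print("STOP: obstacle at %.1f cm" % d)  # the real system plays an audio alert
        time.sleep(0.2)
finally:
    GPIO.cleanup()
```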
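The two-model workaround can be sketched as two TensorFlow Lite interpreters run back to back on the same frame, with their detections merged before feedback is generated. The model file names, score threshold, and output-tensor ordering below are assumptions (SSD exports commonly return boxes, classes, scores, and a count, but the order can vary by export), so this is an illustrative sketch rather than the authors' code.

```python
# Sketch of running two SSD TFLite detectors per frame (file names are placeholders).
import cv2
import numpy as np
from tflite_runtime.interpreter import Interpreter  # tf.lite.Interpreter also works

MODEL_A_PATH = "ssd_mobilenet_coco_quant.tflite"  # pre-trained, 80 COCO classes
MODEL_B_PATH = "ssd_custom_quant.tflite"          # fine-tuned custom / hazard classes

def load_detector(path):
    interp = Interpreter(model_path=path)
    interp.allocate_tensors()
    return interp

def run_ssd(interp, frame_bgr, score_threshold=0.5):
    """Run one quantized SSD detector; return (class_id, score, box) tuples."""
    input_detail = interp.get_input_details()[0]
    _, height, width, _ = input_detail["shape"]
    rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
    resized = cv2.resize(rgb, (width, height))
    # Assumes a uint8-quantized input tensor.
    interp.set_tensor(input_detail["index"], np.expand_dims(resized, 0).astype(np.uint8))
    interp.invoke()
    out = interp.get_output_details()
    # Assumed output order: boxes, classes, scores (verify for a given export).
    boxes = interp.get_tensor(out[0]["index"])[0]
    classes = interp.get_tensor(out[1]["index"])[0]
    scores = interp.get_tensor(out[2]["index"])[0]
    return [(int(c), float(s), b) for c, s, b in zip(classes, scores, boxes)
            if s >= score_threshold]

model_a = load_detector(MODEL_A_PATH)
model_b = load_detector(MODEL_B_PATH)

def detect_all(frame_bgr):
    """Mode 0: run both detectors on the same frame and merge their detections."""
    return ([("A", det) for det in run_ssd(model_a, frame_bgr)] +
            [("B", det) for det in run_ssd(model_b, frame_bgr)])
```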
After defining the labels needed for model B, images must be gathered. Several datasets for object detection exist, such as CIFAR-100 [21], Caltech-101 [22], PASCAL VOC [23], ImageNet [24], and many more on Kaggle [25]. Since deep learning models extract all features (representations) and patterns from the raw data, data quality matters: bad data implies a badly performing model. Images taken with a mobile phone are preprocessed into 1024 by 1024 JPEG images so that they retain the object details at a relatively small file size, which shortens training time. The preprocessing software we used to resize and compress the images is Caesium [26]; TinyJPG [27] is a website offering similar JPEG compression. When collecting images with the mobile camera, we took them from various angles and under different lighting conditions so that the model can generalize across conditions, while keeping the data imbalance problem in mind. For model B we therefore collected 60 images per class, combining fair-use Google Images with photos captured on a mobile phone, and reserved 7 images per class for validation, a validation ratio of 11.67% of the data at hand. The total breaks down as 172 currency images from Kaggle [28] (5, 10, 20, 50, 100, and 200 EGP) + 60 bench + 60 door + 60 fire + 60 fire extinguisher + 60 recycle bin + 120 stairs (up and down) + 60 toggle switch + 60 wallet + 60 wet-floor sign = 772 images. Later, 25 more fire images were added, a deliberate imbalance because fire was the only class that performed poorly, giving roughly 800 images for training, validation, and testing. After splitting, annotation files were created with LabelImg [29], a Python tool developed by Tzutalin. The gathered images were manually annotated with this tool to produce Extensible Markup Language (XML) annotation files, which provide the ground truth for supervised training: image size, path, object names, and bounding-box positions. While annotating, we kept each bounding box as tight as possible so that the model learns the features of the specific object rather than its surroundings.
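The paper resizes and compresses images with the Caesium GUI tool; a roughly equivalent batch step can also be scripted, for example with Pillow as sketched below. The folder names are placeholders, and thumbnail() keeps the aspect ratio (longest side 1024 px) rather than forcing an exact 1024 by 1024 square.

```python
# Alternative batch-resize sketch with Pillow; SRC and DST are placeholder folders.
from pathlib import Path
from PIL import Image

SRC = Path("raw_images")
DST = Path("resized_images")
DST.mkdir(exist_ok=True)

for img_path in SRC.glob("*.jpg"):
    with Image.open(img_path) as im:
        # Shrink so the longest side is 1024 px, preserving aspect ratio,
        # then save as a smaller JPEG to cut training-time I/O.
        im.thumbnail((1024, 1024))
        im.convert("RGB").save(DST / img_path.name, "JPEG", quality=85)
```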
Several optimizations were made to use the resources efficiently: multi-threading improves the rate of frames captured from the camera; object counting saves time, so instead of hearing the same object name repeated, the user hears the number of objects; the ultrasonic code runs at a higher frequency when neither model detects an object in the captured frame; and, finally, some logical statements were replaced with a single arithmetic operation, a meaningful saving for statements repeated every frame.
Consider the second block, mode 1, the simple face recognition mode. This mode matches human faces in the input video frame against a collection of predefined, pre-trained faces. Face detection separates faces from the background or clutter; it involves pre-processing (grayscale conversion and filters that help classify frontal faces) and then localizes a bounding box around each detected face. The face recognition module then identifies or verifies the identity of each detected face by comparing it with the known faces in the collected database.
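A common way to implement this matching on a Raspberry Pi, and the approach used in the guide cited as [32], is the Python face_recognition library built on dlib. The sketch below is a minimal, assumed version of mode 1: the reference photos, names, and tolerance value are placeholders, not the paper's actual database.

```python
# Minimal face-recognition sketch (placeholder reference images and names).
import cv2
import face_recognition

known_people = {
    "person_a": "faces/person_a.jpg",   # assumed paths to one photo per known person
    "person_b": "faces/person_b.jpg",
}

known_names, known_encodings = [], []
for name, path in known_people.items():
    image = face_recognition.load_image_file(path)
    encodings = face_recognition.face_encodings(image)
    if encodings:                        # skip photos where no face was found
        known_names.append(name)
        known_encodings.append(encodings[0])

def recognize(frame_bgr):
    """Mode 1: return the names of known faces found in a video frame."""
    rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
    results = []
    for encoding in face_recognition.face_encodings(rgb):
        matches = face_recognition.compare_faces(known_encodings, encoding,
                                                 tolerance=0.6)  # assumed tolerance
        results.append(known_names[matches.index(True)] if True in matches
                       else "unknown")
    return results
```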
Now consider the third block, mode 2, the reading-text mode. This mode is based on OCR, sometimes referred to as text recognition, using the open-source Tesseract engine. Python-tesseract (pytesseract) is a Python wrapper for the Tesseract-OCR engine and is the OCR tool used in the proposed aiding system. With this tool, a captured image or video frame is analyzed and a list of strings containing the detected words is returned. Several pre-processing techniques can improve the OCR's ability to detect words, such as converting the image to grayscale, removing noise with a slight blur, dilation, erosion, Canny edge detection, and skew correction. Unfortunately, we did not deploy these pre-processing steps, so the results on live video frames are not good enough for smaller text; camera resolution is another factor limiting the reading of small text. Testing the mode on a still image, however, gives noticeably better results.
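A minimal version of this mode, including the grayscale-and-threshold pre-processing that the paper recommends but did not deploy, might look like the following sketch; the specific preprocessing choices and the word-splitting step are assumptions for illustration.

```python
# OCR sketch for mode 2 using pytesseract; preprocessing steps are illustrative choices.
import cv2
import pytesseract

def read_text(frame_bgr):
    """OCR a captured frame; light preprocessing tends to help on small text."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    gray = cv2.medianBlur(gray, 3)                       # mild denoising
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    text = pytesseract.image_to_string(binary)
    # Return the non-empty words, which the system would then speak to the user.
    return [word for word in text.split() if word.strip()]

# Example usage:
# frame = cv2.imread("signboard.jpg")
# print(read_text(frame))
```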
In this paper, we have presented an offline, real-time wearable attachment, a glasses system for aiding blind and visually impaired people. The system includes a camera, an ultrasonic sensor, an embedded Raspberry Pi, a headset, and a battery. It is impossible to give vision to the blind; however, using computer vision and deep learning, the BVI user can obtain information through a headset, indoors or outdoors, about obstacles and situations ahead during navigation. The user can also recognize family and friends or read text and signboards. This helps BVI people perform many activities safely, comfortably, and independently. Unfortunately, the system has not yet been tested with users. However, it is configurable, and several future enhancements and recommendations could make it more reliable before a product is released.
For the reading-text mode, better software could scan several video frames and apply image processing to obtain accurate text detections. For the face recognition mode, a mobile application connected to a server could store the images of people to be recognized, letting the system update over the air whenever it is connected to the internet. For the object detection mode, model B can be improved by adding a large number of examples so that it generalizes better; Generative Adversarial Networks (GANs), proposed by Ian Goodfellow in 2014, could also be considered to synthesize additional data. Training on newer object detection architectures should be considered, for example a multi-frame SSD [33] for video object detection. Model B could also be tuned via transfer learning using the Model Maker API, which trains an already quantized EfficientDet-Lite model; quantization-aware weights should give better results than the proposed TensorFlow API training with 32-bit floating-point weights followed by conversion to float16 or int8.
For a more reliable distance-measuring system, multiple ultrasonic sensors could provide a rough 3D picture by bringing depth into the situation; alternatively, an RGB-D camera could be used and the models trained to estimate depth for every labeled object, although such a sensor would increase the cost significantly. The Raspberry Pi has many spare I/O pins and multiple interfacing protocols, so the system could be extended to communicate with other systems, use any kind of helpful additional sensor, and offer multiple feedback options. Decreasing the weight of the glasses is a major concern, so the electronics boxes could be detached from the glasses' sides and connected through a wired communication protocol. For real-time operation, dynamic wireless channels should not be used; in my opinion, there should be wired channels between the different wearable attachments and the processing units, and these channels and devices should be either waterproof or detachable from the wearables so that the wearables can be treated as ordinary garments. For harsh environments, especially outdoors, special types of ultrasonic sensors can handle much harder circumstances.
References
[1] World Health Organization, "Blindness and vision impairment," [Online]. Available: https://www.who.int/news-room/fact-sheets/detail/blindness-and-visual-impairment. [Accessed 14 October 2021].
[2] P. Ackland, S. Resnikoff and R. Bourne, "World Blindness and Visual Impairment: Despite Many Successes, the Problem Is Growing," Community Eye Health Journal, vol. 30, no. 100, pp. 71-73, 2017.
[3] B. Kuriakose, R. Shrestha and F. E. Sandnes, "Tools and Technologies for Blind and Visually Impaired Navigation Support: A Review Article," IETE Technical Review, vol. 39, no. 1, pp. 3-18, 27 September 2020.
[4] M. R. Mohd Romlay, S. F. Toha, A. M. Ibrahim and I. Venkat, "Methodologies and Evaluation of Electronic Travel Aids for the Visually Impaired People: A Review," Bulletin of Electrical Engineering and Informatics, vol. 10, no. 3, pp. 1747-1758, June 2021.
[5] C. Ye, S. Hong, X. Qian and W. Wu, "Co-Robotic Cane: A New Robotic Navigation Aid for the Visually Impaired," IEEE Systems, Man, and Cybernetics Magazine, pp. 33-42, April 2016.
[6] M. Helmy Abd Wahab, A. A. Talib, H. A. Kadir, A. Johari, A. Noraziah, R. M. Sidek and A. A. Mutalib, "Smart Cane: Assistive Cane for Visually-Impaired People," IJCSI International Journal of Computer Science Issues, vol. 8, no. 4, July 2011.
[7] "WeWALK Smart Cane," Westminster Technologies, [Online]. Available: https://www.westminstertech.com/products/wewalk-smart-cane?variant=31405927923814.
[8] S. B. Kallara, M. Raj, R. Raju, N. Mathew, P. V. R and D. DS, "Indriya - A Smart Guidance System for the Visually Impaired," IEEE Xplore Compliant, pp. 26-29, 23 November 2017.
[9] M. Rahman, M. M. Islam, S. Ahmmed and S. Khan, "Obstacle and Fall Detection to Guide the Visually Impaired People with Real Time Monitoring," SN Computer Science, 27 June 2020.
[10] M. A. Rahman and M. Sadi, "IoT Enabled Automated Object Recognition for the Visually Impaired," Computer Methods and Programs in Biomedicine Update, vol. 1, 21 May 2021.
[11] Sensotec, "OneStep Reader (KNFB Reader)," mobile application for Android and iOS, 9 October 2015. [Online]. Available: https://apps.apple.com/us/app/onestep-reader/id849732663.
[12] CloudSight, "TapTapSee," mobile application for Android and iOS, 4 April 2014. [Online]. Available: https://play.google.com/store/apps/details?id=com.msearcher.taptapsee.android&hl=ar&gl=US.
[13] M. Douděra and Hayaku, "Cash Reader," mobile application for Android and iOS, 26 February 2019. [Online]. Available: https://play.google.com/store/apps/details?id=com.martindoudera.cashreader&hl=ar&gl=US.
[14] Microsoft, "Seeing AI," mobile application for iOS only, 2021. [Online]. Available: https://apps.apple.com/us/app/seeing-ai/id999062298.
[15] Y. Bouteraa, "Design and Development of a Wearable Assistive Device Integrating a Fuzzy Decision Support System for Blind and Visually Impaired People," Micromachines, vol. 12, no. 9, 7 September 2021.
[16] H.-C. Wang, R. Katzschmann, S. Teng, B. Araki, L. Giarre and D. Rus, "Enabling Independent Navigation for Visually Impaired People through a Wearable Vision-Based Feedback System," IEEE International Conference on Robotics and Automation (ICRA), pp. 6533-6540, 29 May 2017.
[17] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu and A. Berg, "SSD: Single Shot MultiBox Detector," arXiv, 29 December 2016.
[18] T.-Y. Lin, M. Maire, S. Belongie, L. Bourdev, R. Girshick et al., "Microsoft COCO: Common Objects in Context," arXiv, 21 February 2015.
[19] R. Joshi, S. Yadav, M. Dutta and C. Travieso-Gonzalez, "Efficient Multi-Object Detection and Smart Navigation Using Artificial Intelligence for Visually Impaired People," Entropy, vol. 22, no. 9, 27 August 2020.
[20] EdjeElectronics, "TensorFlow-Lite-Object-Detection-on-Android-and-Raspberry-Pi," GitHub, 13 December 2020. [Online]. Available: https://github.com/EdjeElectronics/TensorFlow-Lite-Object-Detection-on-Android-and-Raspberry-Pi/blob/master/Raspberry_Pi_Guide.md.
[21] A. Krizhevsky, V. Nair and G. Hinton, "CIFAR-100 (Canadian Institute for Advanced Research)," [Online]. Available: http://www.cs.toronto.edu/~kriz/cifar.html.
[22] N. Kamarudin, M. Makhtar, F. Syed Abdullah, M. Mohamad, F. Mohamad and M. F. Abdul Kadir, "Comparison of Image Classification Techniques Using Caltech 101 Dataset," Journal of Theoretical and Applied Information Technology, pp. 79-86, September 2015.
[23] M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn and A. Zisserman, "The Pascal Visual Object Classes (VOC) Challenge," International Journal of Computer Vision, vol. 88, no. 2, pp. 303-338, June 2010.
[24] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li and L. Fei-Fei, "ImageNet: A Large-Scale Hierarchical Image Database," IEEE Conference on Computer Vision and Pattern Recognition, pp. 248-255, 2009.
[25] Kaggle, [Online]. Available: https://www.kaggle.com/.
[26] "Caesium Image Compressor - Great Image Compression Tool With High Flexibility," ArtiStudio, 2 November 2021. [Online]. Available: https://wiki.artistudio.xyz/docs/web-development/tools/image-compressor/caesium-image-compressor/.
[27] TinyJPG, [Online]. Available: https://tinyjpg.com/.
[28] Egypt-Iris, R. Hisham, S. Tarek and M. ElKarargy, "Egyptian Currency," Kaggle, 1 August 2021. [Online]. Available: https://www.kaggle.com/datasets/egyptiris/egyptian-currency.
[29] Tzutalin, "LabelImg," GitHub, 2015. [Online]. Available: https://github.com/tzutalin/labelImg.
[30] "TensorFlow 2 Detection Model Zoo," GitHub, [Online]. Available: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md.
[31] N. Renotte, "Tensorflow Object Detection Walkthrough," GitHub, 3 April 2021. [Online]. Available: https://github.com/nicknochnack/TFODCourse.
[32] Tim, "Face Recognition With Raspberry Pi and OpenCV," Core Electronics, 30 March 2022. [Online]. Available: https://core-electronics.com.au/guides/face-identify-raspberry-pi/.
[33] A. Broad, M. Jones and T.-Y. Lee, "Recurrent Multi-frame Single Shot Detector for Video Object Detection," 2018.