Sunday, March 30, 2014

OpenCv, calculate average fps in python

After leaving the app running for a couple of hours, I noticed that its memory usage had more than doubled: from 70MB to 150MB of "Real Memory", as Mac's Activity Monitor puts it.

What was happening is that I had created a list that grew with each frame in order to calculate the average fps.

I decided to get rid of it. First I created a circular counter in Python, using a generator that starts over when it reaches a max value. Then I initialised a fixed-size list that the counter indexes, so the fps values are written into it in a circular manner.

It looks like this:
import cv2


def circular_counter(max_value):
    """Endless counter that yields 1, 2, ..., max_value and then starts over."""
    x = 0
    while True:
        if x == max_value:
            x = 0
        x += 1
        yield x

class CvTimer(object):
    def __init__(self):
        self.tick_frequency = cv2.getTickFrequency()
        self.tick_at_init = cv2.getTickCount()
        self.last_tick = self.tick_at_init
        self.fps_len = 100
        self.l_fps_history = [ 10 for x in range(self.fps_len)]
        self.fps_counter = circular_counter(self.fps_len)
    def reset(self):
        self.last_tick = cv2.getTickCount()
    def get_tick_now(self):
        return cv2.getTickCount()
    @property
    def fps(self):
        fps = self.tick_frequency / (self.get_tick_now() - self.last_tick)
        # next() works on Python 2.6+ and 3; the counter is 1-based, hence the -1
        self.l_fps_history[next(self.fps_counter) - 1] = fps
        return fps
    @property
    def avg_fps(self):
        return sum(self.l_fps_history) / float(self.fps_len)
In your frame-by-frame while loop:
#Timecv:
cv2.putText(self.a_frame, "fps=%s avg=%s" % (timer.fps, timer.avg_fps),
            (10, 75), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255,255,255))
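A side note on the fixed-size history: Python's collections.deque with a maxlen discards the oldest item automatically, so the same bounded memory can be had without a separate counter. A minimal sketch, assuming a made-up FpsHistory helper (it is not part of Behave) and leaving the OpenCV tick calls out:

```python
from collections import deque


class FpsHistory(object):
    """Keeps only the most recent `size` fps samples (hypothetical helper)."""

    def __init__(self, size=100):
        # deque(maxlen=...) drops the oldest sample once it is full,
        # so memory use stays constant however long the app runs.
        self.samples = deque(maxlen=size)

    def add(self, fps):
        self.samples.append(fps)

    @property
    def average(self):
        # Average over the samples seen so far; 0.0 before the first frame.
        if not self.samples:
            return 0.0
        return sum(self.samples) / float(len(self.samples))


history = FpsHistory(size=3)
for sample in (10.0, 12.0, 14.0, 16.0):
    history.add(sample)
# Only the last three samples (12, 14, 16) are kept.
print(history.average)
```

Unlike a list pre-filled with a default value, this averages only over real samples, so the first readings are not skewed by the initial filler.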
And finally in the app window:

Behave, declaration of principles

The idea of controlling yourself in front of the computer with the help of an app was born around three years ago, while I was cycling to work. It was a sunny morning, a precious thing in Ireland, and I was riding along the Strawberry Beds road towards Leixlip.

It was clear that it had to use machine learning techniques, computer vision and Python.

Three years have passed, today there is a running app that does the basics of what I had in mind, and it works.

It works so well that I use it as I code: the more I use it, the more I need to code to improve it; the more I code, the more I need it... an unexpected vicious circle.

Behave is open source and free. Its aim is to help people by supporting a healthier back and healthier habits in front of a computer.

Because I firmly believe that health should be a right, not a benefit or a business, this software is and will remain free.

The dream is to reach schools, where youngsters are starting to spend time in front of a screen, and to protect them from back problems and other bad habits as they grow up :)

That's it, it is written.

Saturday, March 29, 2014

Performance with opencv and python, fps reduced

So, I decided to sharpen the knives and get real data on what affects performance when running Behave.

The problem: 

A refactor of the Python code that uses OpenCV made the program run slower.

The refactor consisted of transforming basic code built from functions, "Functional Code":
https://github.com/Kieleth/behave/commit/fda558d61c11720d346a107c174a052f97c2ee3b

Into more abstracted "Code with classes":
https://github.com/Kieleth/behave/commit/503e698dbcbea2c7224fae72c12a94832552c606

The frame rate dropped from roughly 12fps to roughly 5fps, with the same functionality.

The weapons:

- Python performance tips:
https://wiki.python.org/moin/PythonSpeed/PerformanceTips
- Python profiler:
http://docs.python.org/2/library/profile.html
- Opencv optimisation guide:
http://answers.opencv.org/question/755/object-detection-slow/#760
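The profiler can also be driven from inside a script, which makes it easy to profile only the part you care about. A minimal sketch written for Python 3, where busy_work is a made-up stand-in for the frame loop:

```python
import cProfile
import io
import pstats


def busy_work(n):
    """Stand-in for the frame loop (hypothetical, not Behave code)."""
    return sum(i * i for i in range(n))


profiler = cProfile.Profile()
profiler.enable()
result = busy_work(10000)
profiler.disable()

# Sort by cumulative time, like "-s cumulative" on the command line,
# and print only the ten most expensive entries.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(10)
report = stream.getvalue()
print(report)
```

The printed report has the same columns (ncalls, tottime, percall, cumtime) as the command-line runs in this post.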

Process:

The first step was to create a git branch from the functional code; this makes it possible to explore both versions and move between them.

Now, let's get some data:

Functional code:

It runs smoothly at around 10fps.

Added a break in the code that is triggered after 20 seconds of running the app:

profile:
python2.7-32 -m profile -s cumulative capturebasic.py
         16009 function calls (15918 primitive calls) in 16.698 seconds
   Ordered by: cumulative time
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   16.698   16.698 profile:0(<code object <module> at 0x445a88, file "capturebasic.py", line 1>)
        1    0.041    0.041   16.697   16.697 capturebasic.py:1(<module>)
        1    0.253    0.253   16.465   16.465 capturebasic.py:10(launch)
      129    6.311    0.049    6.311    0.049 :0(waitKey)
      129    2.615    0.020    2.615    0.020 :0(imshow)
      129    2.554    0.020    2.554    0.020 :0(detectMultiScale)
      129    1.867    0.014    1.867    0.014 :0(read)
      129    1.124    0.009    1.124    0.009 :0(flip)
      129    0.688    0.005    0.688    0.005 :0(cvtColor)
      129    0.602    0.005    0.602    0.005 :0(equalizeHist)
        1    0.205    0.205    0.205    0.205 :0(VideoCapture)...
cProfile:
python2.7-32 -m cProfile -s cumulative capturebasic.py 
         16184 function calls (16093 primitive calls) in 20.487 seconds
   Ordered by: cumulative time
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.044    0.044   20.488   20.488 capturebasic.py:1(<module>)
        1    0.073    0.073   20.331   20.331 capturebasic.py:10(launch)
      141    8.175    0.058    8.175    0.058 {cv2.waitKey}
      141    3.450    0.024    3.450    0.024 {cv2.imshow}
      141    2.565    0.018    2.565    0.018 {method 'detectMultiScale' of 'cv2.CascadeClassifier' objects}
      141    1.903    0.013    1.903    0.013 {method 'read' of 'cv2.VideoCapture' objects}
        1    1.309    1.309    1.309    1.309 {cv2.VideoCapture}
      141    1.219    0.009    1.219    0.009 {cv2.flip}
      141    0.592    0.004    0.592    0.004 {cv2.equalizeHist}
      141    0.565    0.004    0.565    0.004 {cv2.cvtColor}
        1    0.303    0.303    0.303    0.303 {method 'release' of 'cv2.VideoCapture' objects}
        1    0.115    0.115    0.115    0.115 {cv2.CascadeClassifier}
Found:


Added a break in the code that exits the program after 100 frames:

Using the Unix "time" command:
#3 passes: 
time python2.7-32 capturebasic.py 
real 0m15.977s,       real 0m15.411s,       real 0m15.398s
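Both kinds of break used above (after 20 seconds, after 100 frames) can be sketched as follows. This is only an illustration: run_loop and process_frame are made-up names, and process_frame stands in for the capture, detection and display work done on each pass.

```python
import time


def run_loop(process_frame, max_seconds=None, max_frames=None):
    """Call process_frame() until a time or frame limit is hit.

    Hypothetical helper: process_frame stands in for the capture,
    detection and display work done on each pass of the loop.
    Returns (frames_processed, elapsed_seconds).
    """
    start = time.time()
    frames = 0
    while True:
        if max_frames is not None and frames >= max_frames:
            break
        if max_seconds is not None and time.time() - start >= max_seconds:
            break
        process_frame()
        frames += 1
    return frames, time.time() - start


# Stop after 100 frames, as in the timed runs.
frames, elapsed = run_loop(lambda: None, max_frames=100)
print(frames, elapsed)
```

Dividing the frame count by the elapsed time gives an average fps for the whole run.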


Code with classes:

It runs sluggishly at around 5fps.

Break at 20 seconds of running the app:
python2.7-32 -m cProfile -s cumulative capturebasic.py
         13488 function calls (13397 primitive calls) in 21.944 seconds
   Ordered by: cumulative time
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.160    0.160   21.946   21.946 capturebasic.py:1(<module>)
       52    0.001    0.000    8.801    0.169 classifiers.py:34(__init__)
       52    0.003    0.000    8.800    0.169 classifiers.py:4(__init__)
       52    8.797    0.169    8.797    0.169 {cv2.CascadeClassifier}
       52    0.001    0.000    5.382    0.104 gui.py:17(should_quit)
       52    5.382    0.103    5.382    0.103 {cv2.waitKey}
       52    1.768    0.034    1.768    0.034 {cv2.imshow}
       52    0.001    0.000    1.679    0.032 classifiers.py:21(detect_multiscale)
       52    1.678    0.032    1.678    0.032 {method 'detectMultiScale' of 'cv2.CascadeClassifier' objects}
        1    1.178    1.178    1.178    1.178 {cv2.VideoCapture}
       52    0.952    0.018    0.952    0.018 {method 'read' of 'cv2.VideoCapture' objects}
       52    0.002    0.000    0.838    0.016 utils.py:15(prepare_frame_for_detection)
       52    0.000    0.000    0.714    0.014 utils.py:11(flip_frame)
       52    0.714    0.014    0.714    0.014 {cv2.flip}
       52    0.423    0.008    0.423    0.008 {cv2.cvtColor}
       52    0.413    0.008    0.413    0.008 {cv2.equalizeHist}
        1    0.330    0.330    0.330    0.330 {method 'release' of 'cv2.VideoCapture' objects}
        1    0.003    0.003    0.107    0.107 __init__.py:106(<module>)
        1    0.000    0.000    0.050    0.050 add_newdocs.py:9(<module>)
        2    0.003    0.002    0.042    0.021 __init__.py:1(<module>)
        1    0.002    0.002    0.037    0.037 __init__.py:15(<module>)
        2    0.007    0.004    0.029    0.014 __init__.py:2(<module>)
        1    0.000    0.000    0.028    0.028 type_check.py:3(<module>)
       52    0.001    0.000    0.016    0.000 gui.py:11(display_faces)
Found:

  • "{cv2.CascadeClassifier}" is executed 52 times, that is, once per frame, whereas in the functional code it is executed only once.
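To put a number on it, the figures from the profile above can be turned into per-frame costs. This is rough arithmetic only: the total includes one-off startup work and profiler overhead.

```python
# Figures copied from the cProfile output above (code with classes).
total_seconds = 21.944
frames = 52
classifier_seconds = 8.797  # tottime of {cv2.CascadeClassifier}

per_frame = total_seconds / frames
classifier_per_frame = classifier_seconds / frames
share = classifier_seconds / total_seconds

print("time per frame:        %.3f s" % per_frame)
print("classifier per frame:  %.3f s" % classifier_per_frame)
print("share of the run:      %.0f %%" % (share * 100))
```

So about 0.17 of the roughly 0.42 seconds spent per frame, around 40% of the run, goes into rebuilding the classifier.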

Break after 100 frames:
time python2.7-32 capturebasic.py
real 0m27.675s,     real 0m25.162s,      real 0m26.867s

Solution:

In the refactored code I was instantiating the face detector "FaceClassifier" class in every pass of the loop:
while True:
    ...
    # Bug: a new classifier (and cascade load) on every frame
    face_classifier = FaceClassifier()
    faces_list = face_classifier.detect_multiscale(frame_prepared)
Taking it out of the while loop:
# Build the classifier once, before the loop
face_classifier = FaceClassifier()
while True:
    ...
    faces_list = face_classifier.detect_multiscale(frame_prepared)
And fps are back on track:
Cleaned up camera.
         18736 function calls (18645 primitive calls) in 21.654 seconds
   Ordered by: cumulative time
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.092    0.092   21.656   21.656 capturebasic.py:1(<module>)
      168    0.008    0.000    9.017    0.054 gui.py:17(should_quit)
      168    9.008    0.054    9.008    0.054 {cv2.waitKey}
      168    3.317    0.020    3.317    0.020 {cv2.imshow}
      168    0.020    0.000    2.745    0.016 classifiers.py:21(detect_multiscale)
      168    2.725    0.016    2.725    0.016 {method 'detectMultiScale' of 'cv2.CascadeClassifier' objects}
      168    2.077    0.012    2.077    0.012 {method 'read' of 'cv2.VideoCapture' objects}
      168    0.001    0.000    1.322    0.008 utils.py:11(flip_frame)
      168    1.320    0.008    1.320    0.008 {cv2.flip}
      168    0.015    0.000    1.277    0.008 utils.py:15(prepare_frame_for_detection)
        1    1.165    1.165    1.165    1.165 {cv2.VideoCapture}
      168    0.648    0.004    0.648    0.004 {cv2.equalizeHist}
      168    0.613    0.004    0.613    0.004 {cv2.cvtColor}
        1    0.332    0.332    0.332    0.332 {method 'release' of 'cv2.VideoCapture' objects}
        1    0.002    0.002    0.104    0.104 __init__.py:106(<module>) 
Timings as well:
time python2.7-32 capturebasic.py
real 0m14.860s,    real 0m17.444s,    real 0m13.883s
This is a bug, but a good one, since it has forced me to dig into the performance tools that can be used with OpenCV. I'm pretty sure this will not be the last time I have to use them in the project.
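The difference between the two loops can be demonstrated without OpenCV by counting constructions. A small sketch, where ExpensiveClassifier is a made-up stand-in for FaceClassifier:

```python
class ExpensiveClassifier(object):
    """Stand-in for FaceClassifier (hypothetical): counts constructions."""

    instances_created = 0

    def __init__(self):
        # In the real class this is where cv2.CascadeClassifier is
        # called, which is the slow part seen in the profile.
        ExpensiveClassifier.instances_created += 1

    def detect_multiscale(self, frame):
        return []


def loop_with_bug(frames):
    for frame in frames:
        classifier = ExpensiveClassifier()  # rebuilt on every pass
        classifier.detect_multiscale(frame)


def loop_fixed(frames):
    classifier = ExpensiveClassifier()  # built once, reused
    for frame in frames:
        classifier.detect_multiscale(frame)


frames = range(52)

ExpensiveClassifier.instances_created = 0
loop_with_bug(frames)
created_with_bug = ExpensiveClassifier.instances_created

ExpensiveClassifier.instances_created = 0
loop_fixed(frames)
created_fixed = ExpensiveClassifier.instances_created

print(created_with_bug, created_fixed)
```

A counter like this is a cheap check to leave in while refactoring, since the slowdown itself only shows up once the camera is running.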


Thursday, March 27, 2014

FPS reduced!

A nasty surprise: a refactor to move the code from exploratory-functional to class-abstracted has ended up reducing the fps of the capture.

In simple terms, the exploratory code, which was a while loop with some processing inside, was averaging 12fps.

The "better" code, with classes, modular, extensible, cleaner, etc, gives 5fps.

This suggests that class interaction in the Python code is affecting performance.

Optimisation is needed, some links to follow:



Sunday, March 23, 2014

Viola-Jones, Haar features, Cascades, plan.

A bit of theory:
http://en.wikipedia.org/wiki/Viola%E2%80%93Jones_object_detection_framework
http://en.wikipedia.org/wiki/AdaBoost

Slides:
http://www.slideshare.net/wolf/avihu-efrats-viola-and-jones-face-detection-slides/

Viola-Jones explained:
https://sites.google.com/site/5kk73gpu2012/assignment/viola-jones-face-detection#TOC-Image-Pyramid

Integral Images
http://en.wikipedia.org/wiki/Summed_area_table
Haar:http://en.wikipedia.org/wiki/Haar-like_features
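
To make the integral image idea concrete, here is a small pure-Python sketch. The function names are illustrative and not taken from OpenCV; the point is that once the table is built, the sum of any rectangle costs four lookups.

```python
def integral_image(img):
    """Summed-area table with an extra zero row and column.

    ii[y][x] holds the sum of all pixels above and to the left of
    (x, y), that is, rows 0..y-1 and columns 0..x-1.
    """
    height, width = len(img), len(img[0])
    ii = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the w x h rectangle whose top-left corner is (x, y)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
ii = integral_image(image)
print(rect_sum(ii, 1, 1, 2, 2))  # bottom-right 2x2 block: 5 + 6 + 8 + 9
```

A two-rectangle Haar-like feature is then just the difference between two rect_sum calls on adjacent rectangles.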

Good to understand.
http://docs.opencv.org/trunk/doc/py_tutorials/py_objdetect/py_face_detection/py_face_detection.html

Future, for training the cascades:
http://docs.opencv.org/doc/user_guide/ug_traincascade.html

The nice thing is that, since this is a working prototype, it is watching my back even as I write this post.

What are next steps?

1. Incorporate GUI:
  - selection of threshold to check face position (drag and drop?)
  - customization of alerts when alarm is triggered (idea is to drop some scripts in a folder, to make it extensible).
  - Turn camera on-off to improve performance.
  - Menu to select tweaks in performance (investigate how to show real FPS)
    - face detection tweaks
    - general camera capture tweaks.
    - Auto-find best setup for the computer.

2. Add nail-biting detection.

3. Make it installable (bundle opencv-required libs inside the project).

Future:
- "Teaching" option, so user tells Behave to learn things to detect.

Behaving

Digging into opencv and theory behind object detection/recognition.

Project to detect hand gestures, great stuff.
https://www.youtube.com/watch?v=8Vr08EYBN04

And, face detection using Haar cascades explained:
https://www.youtube.com/watch?v=sWTvK72-SPU

Saturday, March 22, 2014

Behave, alive.

After some months of proper procrastination, finally Behave has a raw working version:

https://github.com/Kieleth/behave

I decided to redo the version I had previously, since it was giving problems with latency in the capture and image processing. So I cut it short and simplified as much as I could.

It also helped a lot that the online OpenCV documentation is back. Some nice articles for the basics:

http://docs.opencv.org/trunk/doc/py_tutorials/py_gui/py_video_display/py_video_display.html

http://docs.opencv.org/trunk/doc/py_tutorials/py_objdetect/py_face_detection/py_face_detection.html

At the moment it runs smoothly thanks to some minor tweaks in the face-detection call:

    #Recognition:
    scale_factor = 1.3
    min_neigh = 4
    flags = cv2.CASCADE_SCALE_IMAGE
    minSize = (200, 200)

    faces = face_cascade.detectMultiScale(gray, scale_factor, min_neigh, minSize=minSize, flags=flags)

Where:

    cv2.cv.CV_HAAR_SCALE_IMAGE: Scales each windowed image region to match the feature data. (The default approach is the opposite: scale the feature data to match the window.) Scaling the image allows for certain optimizations on modern hardware. This flag must not be combined with others.

  • scaleFactor – Parameter specifying how much the image size is reduced at each image scale.
  • minNeighbors – Parameter specifying how many neighbors each candidate rectangle should have to retain it.
  • flags – Parameter with the same meaning for an old cascade as in the function cvHaarDetectObjects. It is not used for a new cascade.
  • minSize – Minimum possible object size. Objects smaller than that are ignored.
  • maxSize – Maximum possible object size. Objects larger than that are ignored

The idea is to find optimal parameters automatically using frame times, but that's another day's work :D
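
One way that search might look, as a sketch only: time each candidate set of parameters on a frame and keep the fastest one that still finds a face. Everything here is an assumption for illustration: pick_fastest is a made-up name, detect is a hypothetical wrapper around face_cascade.detectMultiScale, and fake_detect just sleeps so the example runs without a camera.

```python
import time


def pick_fastest(detect, frame, candidates, repeats=5):
    """Return (params, seconds) for the fastest candidate that still
    detects something, or None if no candidate finds anything.

    detect(frame, **params) is a hypothetical wrapper around
    face_cascade.detectMultiScale; candidates is a list of dicts.
    repeats must be at least 1.
    """
    best = None
    for params in candidates:
        start = time.time()
        for _ in range(repeats):
            faces = detect(frame, **params)
        elapsed = (time.time() - start) / repeats
        if len(faces) == 0:
            continue  # fast but useless: no face found
        if best is None or elapsed < best[1]:
            best = (params, elapsed)
    return best


def fake_detect(frame, scale_factor, min_neighbors):
    """Fake detector for the demo: smaller scale factors are slower,
    and a very strict min_neighbors finds nothing."""
    time.sleep(0.02 if scale_factor < 1.2 else 0.002)
    return [] if min_neighbors > 5 else [(0, 0, 10, 10)]


candidates = [
    {"scale_factor": 1.1, "min_neighbors": 4},
    {"scale_factor": 1.3, "min_neighbors": 4},
    {"scale_factor": 1.5, "min_neighbors": 8},
]
best = pick_fastest(fake_detect, None, candidates, repeats=2)
print(best)
```

A real version would need to average over several frames and check that the detected face is the right one, not just that something was found.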

I'll be working on adding functionality and cleaning up the first version of the code (in exploratory mode still).



"Bus error: 10" Cont.


It seems the best option is to build cv again, so let's get the latest version:

# in a nice place you want to do your clone:
git clone https://github.com/Itseez/opencv.git opencv
cd opencv
mkdir makethis; cd makethis 
 # select 2.4 branch:
Luiss-iMac:opencv Luis$ git branch -a
* master
  remotes/origin/2.4
  remotes/origin/HEAD -> origin/master
  remotes/origin/master
Luiss-iMac:opencv Luis$ git checkout 2.4
# This configures the 32-bit build:
cmake -G "Unix Makefiles" -D CMAKE_OSX_ARCHITECTURES=i386 -D CMAKE_C_FLAGS=-m32 -D CMAKE_CXX_FLAGS=-m32 -D CMAKE_INSTALL_PREFIX=/usr/local -D PYTHON_EXECUTABLE:FILEPATH=/usr/local/bin/python2.7-32 -D INSTALL_PYTHON_EXAMPLES:BOOL=ON .. 
# for the 64 version:
cmake -D CMAKE_BUILD_TYPE=RELEASE -D CMAKE_INSTALL_PREFIX=/usr/local -D PYTHON_EXECUTABLE:FILEPATH=/usr/local/bin/python2.7 -D INSTALL_PYTHON_EXAMPLES:BOOL=ON ..
make -j8
# Go and get a beer...

 Success looks like:
BUILD SUCCESSFUL
Total time: 5 seconds
[100%] Built target opencv_test_java 
sudo make install
Now, you might think that it's done:

Luiss-iMac:make5 Luis$ python2.7-32
Python 2.7.5 (v2.7.5:ab05e7dd2788, May 13 2013, 13:18:45)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/site-packages/cv.py", line 1, in <module>
    from cv2 import *
ImportError: dlopen(/usr/local/lib/python2.7/site-packages/cv2.so, 2): Symbol not found: __ZN2cv10PCAComputeERKNS_11_InputArrayERKNS_12_OutputArrayES5_d
  Referenced from: /usr/local/lib/python2.7/site-packages/cv2.so
  Expected in: lib/libopencv_core.3.0.dylib
 in /usr/local/lib/python2.7/site-packages/cv2.so
What's wrong now? Well, this is an old fight I had with OpenCV. I managed to repel the enemy with:

          export DYLD_LIBRARY_PATH=/usr/opencv/lib/ 

That is, tell the dynamic linker where to find the libs needed for cv2.so:

https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man1/dyld.1.html

How do you find out which libs those are?

Luiss-iMac:site-packages Luis$ otool -L cv2.so

cv2.so:

     cv2.so (compatibility version 0.0.0, current version 0.0.0)

     /System/Library/Frameworks/Python.framework/Versions/2.7/Python (compatibility version 2.7.0, current version 2.7.2)
     lib/libopencv_core.2.4.dylib (compatibility version 2.4.0, current version 2.4.7)
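
When the otool listing is long, the entries worth looking at are the ones that are not absolute paths, since those depend on where the dynamic linker searches. A small sketch that picks them out of the output; relative_dependencies is a made-up helper, and the sample text is the listing above:

```python
def relative_dependencies(otool_output):
    """List entries in `otool -L` output that are not absolute paths."""
    deps = []
    # The first line is just the name of the inspected file.
    for line in otool_output.splitlines()[1:]:
        line = line.strip()
        if not line:
            continue
        path = line.split(" (compatibility")[0]
        # Skip absolute paths and @rpath/@loader_path style entries.
        if not path.startswith("/") and not path.startswith("@"):
            deps.append(path)
    return deps


sample = """cv2.so:
    cv2.so (compatibility version 0.0.0, current version 0.0.0)
    /System/Library/Frameworks/Python.framework/Versions/2.7/Python (compatibility version 2.7.0, current version 2.7.2)
    lib/libopencv_core.2.4.dylib (compatibility version 2.4.0, current version 2.4.7)
"""
deps = relative_dependencies(sample)
print(deps)
```

Here the result includes lib/libopencv_core.2.4.dylib, which is the library that DYLD_LIBRARY_PATH has to help locate.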

After this (you can add the export to your profile), and remembering to use python-32 if you installed OpenCV as 32-bit, you can use cv again:

Luiss-iMac:site-packages Luis$ python-32
Python 2.7.5 (v2.7.5:ab05e7dd2788, May 13 2013, 13:18:45)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv
>>>


EDIT:
An updated version on how to (re-)install OpenCv and/or set your DYLD_LIBRARY_PATH can be found here.