Neil Hopcroft

A digital misfit

Build break in of-video #210

The of-video project has failed with these errors:

In file included from AVHandler.cc:25:0:
AVHandler.h:244:12: error: expected ‘;’ at end of member declaration
AVFrame *create_frame(PixelFormat fmt);
^
AVHandler.h:244:37: error: expected ‘)’ before ‘fmt’
AVFrame *create_frame(PixelFormat fmt);
^
AVHandler.cc: In member function ‘int AVHandler::setup_write()’:
AVHandler.cc:117:39: error: ‘CODEC_ID_NONE’ was not declared in this scope
av_output->oformat->audio_codec = CODEC_ID_NONE;
^
AVHandler.cc:146:49: error: expression cannot be used as a function
frame = create_frame(vstream->codec->pix_fmt);
^
AVHandler.cc:147:29: error: ‘PIX_FMT_RGB24’ was not declared in this scope
rgbframe = create_frame(PIX_FMT_RGB24);
^
AVHandler.cc:147:42: error: expression cannot be used as a function
rgbframe = create_frame(PIX_FMT_RGB24);
^

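These are some more identifiers missing their AV prefix following some configuration changes: the newer libav/FFmpeg headers expect AVPixelFormat, AV_CODEC_ID_NONE and AV_PIX_FMT_RGB24 where the code still uses PixelFormat, CODEC_ID_NONE and PIX_FMT_RGB24. I have made some local fixes for these.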
Fixed in build #220.
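
Renames like these are mechanical, so they could be scripted rather than fixed by hand. A minimal sketch, assuming the only affected identifiers are the three families seen in the log above; the function name add_av_prefix is made up for illustration and is not part of of-video or FFmpeg:

```python
import re

# Old identifier patterns and their AV-prefixed replacements.
# Only the names seen in the build log are covered (an assumption).
RENAMES = [
    (re.compile(r"\bPixelFormat\b"), "AVPixelFormat"),
    (re.compile(r"\bCODEC_ID_"), "AV_CODEC_ID_"),
    (re.compile(r"\bPIX_FMT_"), "AV_PIX_FMT_"),
]


def add_av_prefix(source):
    """Return source text with the old names replaced by AV-prefixed ones."""
    for pattern, replacement in RENAMES:
        source = pattern.sub(replacement, source)
    return source


if __name__ == "__main__":
    print(add_av_prefix("rgbframe = create_frame(PIX_FMT_RGB24);"))
```

The word-boundary anchors mean already-prefixed names such as AV_PIX_FMT_RGB24, and lower-case members such as pix_fmt, are left alone, so running it twice is harmless.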


Adding yocto project

There seem to be a number of people using Yocto Project for their embedded systems builds; indeed, I have experimented with it for the Intel Galileo board I dabbled with a couple of years ago.
To add this to the CI build server I am going to use the Jethro branch. The first step is to clone the code and manually try some builds from the command line. This gives me an error:

libsdl-native is set to be ASSUME_PROVIDED but sdl-config can’t be found in PATH. Please either install it, or configure qemu not to require sdl.

I will add libsdl to the CI build server and see if that resolves the problem – it doesn’t. The latest code there installs libSDL2 and sdl2-config, which doesn’t resolve the dependency problem. Instead removing the SDL lines from the yocto/build/conf/local.conf file does resolve the problem.
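
The local.conf edit could be scripted so that a fresh workspace gets the same treatment. A sketch, assuming the offending lines are simply those in local.conf that mention sdl, and that commenting them out has the same effect as removing them; the function name is hypothetical:

```python
def comment_out_sdl(conf_text):
    """Comment out every active line of a local.conf that mentions sdl."""
    out = []
    for line in conf_text.splitlines():
        stripped = line.lstrip()
        if "sdl" in stripped.lower() and not stripped.startswith("#"):
            out.append("# " + line)  # keep the original text for reference
        else:
            out.append(line)
    return "\n".join(out) + "\n"
```

Commenting out rather than deleting keeps a record of what was changed, which helps if the SDL dependency ever needs to go back in.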
The next problem is the sheer size of the build – it is, after all, a system to build a complete image – I aborted the ‘bitbake world’ build at about 20% completion when it threatened to consume all available disk space. ‘bitbake core-image-minimal’ is probably manageable. Then there is the question of how long a build will take – the 20% of the ‘world’ build had taken nearly 2 days, so it isn’t something we want to do clean rebuilds of on a regular basis.
Now the build is failing with

build/genattrtab /usr/share/tomcat7/.jenkins/workspace/yocto/build/tmp/work-shared/gcc-5.2.0-r0/gcc-5.2.0/gcc/common.md /usr/share/tomcat7/.jenkins/workspace/yocto/build/tmp/work-shared/gcc-5.2.0-r0/gcc-5.2.0/gcc/config/i386/i386.md insn-conditions.md \
-Atmp-attrtab.c -Dtmp-dfatab.c -Ltmp-latencytab.c
make[1]: *** [s-attrtab] Killed

A quick search reveals that this is likely caused by running out of memory. Applying the suggested fix, adding some swap space, resolves this problem.
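
Before starting a build of this size it may be worth checking how much RAM and swap the machine has. A sketch that parses text in the format of /proc/meminfo; the function names are illustrative, and any threshold you compare the result against is a guess rather than a documented Yocto requirement:

```python
def parse_meminfo(text):
    """Return {field: size in kB} for the kB-valued lines of /proc/meminfo."""
    info = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        # Only keep lines of the form "Name:   12345 kB".
        if len(parts) == 2 and parts[0].isdigit() and parts[1] == "kB":
            info[name.strip()] = int(parts[0])
    return info


def ram_plus_swap_kb(info):
    """Total physical memory plus swap, in kB."""
    return info.get("MemTotal", 0) + info.get("SwapTotal", 0)
```

On the build server this would be fed the contents of /proc/meminfo, for example ram_plus_swap_kb(parse_meminfo(open("/proc/meminfo").read())).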
Now the build runs to completion, taking 37 hours to complete even though I had already allowed 200-odd of the tasks to run in a shell.
The next question is how updates occur. I have launched a few more builds since this completion and there have been no changes, leaving the build with no tasks requiring rerunning. I’ll put this on a weekly rebuild schedule and see what happens.


Resurrecting Where’s Neil? and the OBD data logger

A little experimenting with the bluetooth OBD units in my car has shown that one of them hasn’t blown yet. I thought my car had a tendency to overvoltage and smoke them; at least a couple have given up their magic smoke.
This means there is potential to resurrect the “Where’s Neil?” web page (which might become password protected at some point, to maintain my privacy – I’ll give you a password if I’m coming to visit you). Next though, I need to get the data connection working again – my PAYG SIM has expired. My old Android phone, the one with the OBD software on, is SIM locked to Orange, but I have a Mifi dongle that is not tied to any particular network. I spoke to someone in a phone shop who was absolutely sure they didn’t do data SIMs and that a normal voice SIM wouldn’t work in a data dongle. But I bought a new EE voice SIM anyway, thinking that it would work with my old Android phone even if it didn’t work with the dongle. It turns out it is the other way around: it is fine in the dongle but not in the phone. So now I have a chain that looks like this:

Engine <=> CAN Bus <=> OBD reader <=> Bluetooth <=> Phone <=> Wifi <=> Dongle <=> GSM <=> Webserver

Nothing can go wrong here, surely.
Actually, it does work, although it is a bit cumbersome. So the page is live again, but it will only be used occasionally since my car will run its battery down if I forget to remove the OBD unit – I have put a new heavy duty battery in it, but I still worry. A quick search on Amazon does reveal some small Lithium Ion battery packs that will provide enough power for a jump start, so it might be worth investing in something like that, in case of emergency.


Build break – of-fem-fenics build #156

Build #156 of of-fem-fenics is giving this error:

CPPFLAGS="-std=c++11 -fopenmp -DDOLFIN_VERSION=\"1.7.0dev\" -DNDEBUG -DDOLFIN_SIZE_T=8 -DDOLFIN_LA_INDEX_SIZE=4 -DHAS_PETSC -DENABLE_PETSC_TAO -DHAS_UMFPACK -DHAS_CHOLMOD -DHAS_PARMETIS -DHAS_ZLIB -DHAS_CPPUNIT -DHAS_MPI -DHAS_OPENMP -DHAS_QT4 -DHAS_VTK -DHAS_QVTK -I/usr/local/include -I/usr/local/include/vtk-6.3 -I/usr/local/Trolltech/Qt-4.8.6/include -I/usr/local/include/eigen3 -DLATEST_DOLFIN" /usr/local/bin/mkoctfile -c Mesh.cc -o Mesh.o
In file included from /usr/local/include/dolfin/function/FunctionSpace.h:37:0,
from /usr/local/include/dolfin/function/dolfin_function.h:10,
from /usr/local/include/dolfin.h:26,
from mesh.h:21,
from Mesh.cc:19:
/usr/local/include/dolfin/fem/FiniteElement.h: In member function ‘void dolfin::FiniteElement::map_from_reference_cell(double*, const double*, const ufc::cell&) const’:
/usr/local/include/dolfin/fem/FiniteElement.h:184:21: error: ‘const class ufc::finite_element’ has no member named ‘map_from_reference_cell’
_ufc_element->map_from_reference_cell(x, xhat, c);
^
/usr/local/include/dolfin/fem/FiniteElement.h: In member function ‘void dolfin::FiniteElement::map_to_reference_cell(double*, const double*, const ufc::cell&) const’:
/usr/local/include/dolfin/fem/FiniteElement.h:192:21: error: ‘const class ufc::finite_element’ has no member named ‘map_to_reference_cell’
_ufc_element->map_to_reference_cell(xhat, x, c);
^
make: *** [Mesh.o] Error 1

This build was, like the recent dolfin build, made following ffc build #45, among a number of other upstream project changes. I’m hoping that resolving the dolfin build errors will also resolve this build.
The dolfin build, while still erroring, has resolved the problems I suspected to come from this change, but this error remains, so further investigation is required.
Indeed, it turns out that fixing the dolfin build errors also fixes these.
Fixed in build #239.


State of the nation: things are looking up

For the first time since I started running the build server in earnest there are no red dots showing next to any of the projects. I have disabled a few projects, mostly because they are not used by the target projects I am interested in and because they have suffered some kind of build error which will take some tracking down.
There are still some test failures, showing as yellow dots for unstable builds, notably in the event locator project. This project needs revisiting and completely refactoring at some point, but first I need to dig out a development desktop machine with enough memory and disk space to run the development environment.
Now we have a good baseline from which any failures can be investigated and resolved.
The build queue is now pretty constantly full, so this is really too small a machine for the builds I am trying to run. But since I am only running them ‘for fun’ and not directly toward any business goal, there isn’t a pressing desire to upgrade the AWS instance. Besides, I have paid for three years of a heavy utilisation instance at this level, so I may as well keep it running doing something.
One factor making the build queue so full is that the octave project triggers builds of all the of- projects upon completion, so every change to octave causes 70-something items to be added to the queue. I could reduce the impact of this by turning the SCM polling frequency down: it is currently @hourly, which I suspect launches a lot of builds that could just as well be done @daily, especially given that the build itself takes 12 hours.
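
A rough upper bound on the queue load can be worked out from these figures. A sketch, assuming exactly 70 downstream projects (the real number is "70-something"), that every poll finds a change, and that a new octave build cannot start until the previous 12-hour one has finished; all of these are simplifications:

```python
def downstream_items_per_day(polls_per_day, build_hours, downstream_jobs):
    """Worst-case number of downstream queue items added per day."""
    # A build that takes build_hours can complete at most this often per day.
    max_builds_by_duration = 24 // build_hours
    builds = min(polls_per_day, max_builds_by_duration)
    return builds * downstream_jobs


hourly = downstream_items_per_day(24, 12, 70)  # polling @hourly
daily = downstream_items_per_day(1, 12, 70)    # polling @daily
print(hourly, daily)
```

Under these assumptions the worst case is 140 items a day with @hourly polling against 70 with @daily, so the change would at most halve the load from this one trigger.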


Build break – obd android #42

Build #42 of the obd android project failed with

A problem occurred configuring root project ‘obd android’.
failed to find Build Tools revision 23.0.1

I currently have v22.0.1 installed, so time to update the Android SDK.
Running ‘tools/android update sdk --no-ui’ doesn’t update my build tools, despite going through some of the downloading/updating process. I did, however, find some instructions for installing different versions of build tools. I have now added 23.0.1 to my build-tools/ directory and launched a new build.
For some reason this hasn’t worked, so I have switched the build to use 22.0.1, the currently installed version. This allows the build to complete.
Fixed in build #53.


More on the dolfin build breaks

Time to roll up my sleeves and get my hands dirty in the code following my previous chasing of various build errors.
The build error I am seeing now is

/usr/share/tomcat7/.jenkins/workspace/dolfin/build/dolfin/swig/modules/mesh/modulePYTHON_wrap.cxx: In function ‘PyObject* _wrap_MeshGeometry_set(PyObject*, PyObject*)’:
/usr/share/tomcat7/.jenkins/workspace/dolfin/build/dolfin/swig/modules/mesh/modulePYTHON_wrap.cxx:18767:60: error: no matching function for call to ‘dolfin::MeshGeometry::set(std::size_t&, const std::vector<double>&)’
(arg1)->set(arg2,(std::vector< double > const &)*arg3);
^

This is in the swig autogenerated file modulePYTHON_wrap.cxx, but I can’t work out why the build is trying to use the Python wrappers at all. Disabling the Python wrappers in the cmake options allows the build to complete.
While looking at this, I noticed that cmake is complaining that PETSc was built without cusp support, so PETSc support is disabled. Looking further into it, cusp requires a CUDA-compatible GPU, which is not available on the AWS cloud (leastwise, not on the machine I am using there; it might be on the chunkier machines, but I doubt it). So I am going to have to live without PETSc support in dolfin.
This also means that I can disable the petsc, petsc4py, slepc and slepc4py projects, since they are only used by dolfin.
The build is finally fixed in build #506.


Fixing of-fixed

The fixed package is the most complex of the packages to get installed. There has been some fairly significant code rot since its last release and it doesn’t build any more. There are 55 errors in my first attempt to build it.

/usr/local/include/octave-4.1.0+/octave/../octave/oct-cmplx.h:49:10: error: no match for ‘operator==’ (operand types are ‘const volatile FixedPoint’ and ‘const volatile FixedPoint’)
if (ax == bx) \
^

There are some problems with complex comparisons and HDF5 configuration. Fixing these at least gets us a build that completes, however we are still stuck with a problem during installation:

“‘dispatch’ undefined near line 2 column 1
error: called from ‘/usr/share/tomcat7/octave/fixed-0.7.10/PKG_ADD’ in file /usr/share/tomcat7/octave/fixed-0.7.10/PKG_ADD near line 2, column 1”

Looking inside fixed.cc and fsort.m there are some requests to include calls to dispatch() in the PKG_ADD file – disabling these resolves the undefined dispatch error message.
Now I see

/usr/share/tomcat7/octave/fixed-0.7.10/x86_64-unknown-linux-gnu-api-v50+/fixed.oct: failed to load: /usr/share/tomcat7/octave/fixed-0.7.10/x86_64-unknown-linux-gnu-api-v50+/fixed.oct: undefined symbol: _ZNK5ArrayI17FixedPointComplexE17resize_fill_valueEv

Which is “Array<FixedPointComplex>::resize_fill_value() const” missing – this is probably something I commented out in Array-f.cc to get that to build. Let’s try putting that back in now. This brings us back to the original problem.
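
The mangled symbol can be read by hand because the Itanium C++ ABI, which GCC follows, writes each identifier as a decimal length followed by that many characters. A sketch that pulls out just those length-prefixed identifiers; it is not a real demangler (it ignores substitutions, builtin types and qualifiers such as the K for const), and c++filt is the proper tool for the job:

```python
def length_prefixed_names(symbol):
    """Extract the <length><identifier> pieces from a mangled C++ symbol."""
    names = []
    i = 0
    while i < len(symbol):
        if symbol[i].isdigit():
            # Read the whole decimal length, then take that many characters.
            j = i
            while j < len(symbol) and symbol[j].isdigit():
                j += 1
            length = int(symbol[i:j])
            names.append(symbol[j:j + length])
            i = j + length
        else:
            i += 1  # skip markers such as _Z, N, K, I, E, v
    return names
```

For the symbol in the error above this yields Array, FixedPointComplex and resize_fill_value, which is how the template argument on Array can be recovered.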
By some magic, this was fixed in build #133, which I think reflects some changes to octave to undo an incompatible change – following this change it might be possible to undo some of the ‘get-it-to-build’ fixes I have made which may have removed or broken functionality.


Build break – dolfin build #338

Build #338 of dolfin failed with this error:

[ 98%] Building CXX object dolfin/swig/modules/function/CMakeFiles/_function.dir/modulePYTHON_wrap.cxx.o
/usr/share/tomcat7/.jenkins/workspace/dolfin/build/dolfin/swig/modules/function/modulePYTHON_wrap.cxx: In function ‘PyObject* _wrap_Function_non_matching_eval(PyObject*, PyObject*)’:
/usr/share/tomcat7/.jenkins/workspace/dolfin/build/dolfin/swig/modules/function/modulePYTHON_wrap.cxx:11623:41: error: ‘const class dolfin::Function’ has no member named ‘non_matching_eval’
((dolfin::Function const *)arg1)->non_matching_eval(*arg2,(dolfin::Array< double > const &)*arg3,(ufc::cell const &)*arg4);
^

Which seems like it is probably caused by this commit:

Commit 88a38a716b887b9d8d7ba0d5ce9e33a410907bbf by gnw20
Remove deprecated functions slated for removal with 1.7 release.

Then, in build #340 the error becomes:

CMake Error at CMakeLists.txt:889 (message):
Generation of form files failed:

Traceback (most recent call last):

File “/usr/local/bin/ffc”, line 213, in <module>
sys.exit(main(sys.argv[1:]))
File “/usr/local/bin/ffc”, line 187, in main
compile_form(ufd.forms, ufd.object_names, prefix, parameters)
File “/usr/local/lib64/python2.7/site-packages/ffc/compiler.py”, line 154, in compile_form
analysis = analyze_forms(forms, parameters)
File “/usr/local/lib64/python2.7/site-packages/ffc/analysis.py”, line 67, in analyze_forms
unique_elements = sort_elements(unique_elements)
File “/usr/local/lib/python2.7/site-packages/ufl/algorithms/analysis.py”, line 205, in sort_elements
sorted_elements = topological_sorting(nodes, edges)
File “/usr/local/lib/python2.7/site-packages/ufl/utils/sorting.py”, line 36, in topological_sorting
S = nodes[:]

TypeError: ‘set’ object has no attribute ‘__getitem__’

Traceback (most recent call last):

File “/usr/share/tomcat7/.jenkins/workspace/dolfin/cmake/scripts/generate-form-files”, line 80, in <module>
raise RuntimeError(“Unable to compile form: %s/%s” % (root, f))

RuntimeError: Unable to compile form: dolfin/ale/Poisson2D.ufl

This build was triggered by completion of ffc build #45. Looking at the changelog for ffc, this change is the most likely candidate:

Commit bd892c555ed4f751e7dc149c15b2b0799da75a68 by martinal
Fixes and simplifications of element analysis.
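
The TypeError itself is easy to reproduce: slicing with nodes[:] only works on sequences, so it fails as soon as a caller passes a set. A sketch of a topological sort that copes with any iterable by copying it into a list first. This is a generic Kahn-style implementation written for illustration, not the ufl code, and the edges format (a dict mapping each node to the nodes that must come after it) is an assumption:

```python
from collections import deque


def topological_sort(nodes, edges):
    """Return nodes ordered so that every edge src -> dst has src first."""
    # sorted() accepts sets, lists and tuples alike, unlike nodes[:].
    ordered = sorted(nodes)
    indegree = {n: 0 for n in ordered}
    for src in ordered:
        for dst in edges.get(src, ()):
            indegree[dst] += 1

    ready = deque(n for n in ordered if indegree[n] == 0)
    result = []
    while ready:
        node = ready.popleft()
        result.append(node)
        for dst in sorted(edges.get(node, ())):
            indegree[dst] -= 1
            if indegree[dst] == 0:
                ready.append(dst)

    if len(result) != len(ordered):
        raise ValueError("graph contains a cycle")
    return result
```

Sorting the input also makes the output deterministic, which matters when the result feeds generated code that gets compared between builds.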

I notice the ffc build doesn’t include a clean step, so I am going to try adding that before investigating further.
This error was fixed in build #352, but I can’t see a specific change that would have caused the fix – this always worries me, having something magically fix itself without really understanding why. But I’m not going to investigate too much further – there were a lot of upstream changes feeding into this build, so it could be any one of those.
However, we are still left with a further error:

/usr/share/tomcat7/.jenkins/workspace/dolfin/build/dolfin/swig/modules/function/modulePYTHON_wrap.cxx: In function ‘PyObject* _wrap_Function_non_matching_eval(PyObject*, PyObject*)’:
/usr/share/tomcat7/.jenkins/workspace/dolfin/build/dolfin/swig/modules/function/modulePYTHON_wrap.cxx:11551:41: error: ‘const class dolfin::Function’ has no member named ‘non_matching_eval’
((dolfin::Function const *)arg1)->non_matching_eval(*arg2,(dolfin::Array< double > const &)*arg3,(ufc::cell const &)*arg4);
^
make[2]: *** [dolfin/swig/modules/function/CMakeFiles/_function.dir/modulePYTHON_wrap.cxx.o] Error 1

This changes in build #362, to become:

/usr/share/tomcat7/.jenkins/workspace/dolfin/build/dolfin/swig/modules/mesh/modulePYTHON_wrap.cxx:27303:64: error: no matching function for call to ‘dolfin::Cell::get_vertex_coordinates(double*&) const’
((dolfin::Cell const *)arg1)->get_vertex_coordinates(arg2);
^
/usr/share/tomcat7/.jenkins/workspace/dolfin/build/dolfin/swig/modules/mesh/modulePYTHON_wrap.cxx:27303:64: note: candidate is:
In file included from /usr/share/tomcat7/.jenkins/workspace/dolfin/dolfin/function/FunctionSpace.h:38:0,
from /usr/share/tomcat7/.jenkins/workspace/dolfin/build/dolfin/swig/modules/mesh/modulePYTHON_wrap.cxx:3862:
/usr/share/tomcat7/.jenkins/workspace/dolfin/dolfin/mesh/Cell.h:338:10: note: void dolfin::Cell::get_vertex_coordinates(std::vector&) const
void get_vertex_coordinates(std::vector& coordinates) const
^
/usr/share/tomcat7/.jenkins/workspace/dolfin/dolfin/mesh/Cell.h:338:10: note: no known conversion for argument 1 from ‘double*’ to ‘std::vector&’
make[2]: *** [dolfin/swig/modules/mesh/CMakeFiles/_mesh.dir/modulePYTHON_wrap.cxx.o] Error 1

And, finally, in build #366 to become:

Linking CXX shared library libdolfin.so
/usr/bin/ld: cannot find -lvtkftgl
/usr/bin/ld: cannot find -lvtkftgl
collect2: error: ld returned 1 exit status

Which looks to be something missing from the vtk build. Again, vtk doesn’t include a clean step (why is this missing from the default Jenkins project setup?), so I’m going to try adding that and rebuilding before I do any further investigating.
Build #375 introduces:

[ 79%] Building CXX object dolfin/CMakeFiles/dolfin.dir/fem/FiniteElement.cpp.o
/usr/share/tomcat7/.jenkins/workspace/dolfin/dolfin/fem/FiniteElement.cpp: In member function ‘void dolfin::FiniteElement::tabulate_dof_coordinates(boost::multi_array&, const std::vector&, const dolfin::Cell&) const’:
/usr/share/tomcat7/.jenkins/workspace/dolfin/dolfin/fem/FiniteElement.cpp:52:17: error: ‘const class ufc::finite_element’ has no member named ‘tabulate_dof_coordinates’
_ufc_element->tabulate_dof_coordinates(coordinates.data(),
^
make[2]: *** [dolfin/CMakeFiles/dolfin.dir/fem/FiniteElement.cpp.o] Error 1

Which is fixed again in #376.
Then #380 gives a lot more library problems:

Linking CXX shared library libdolfin.so
/usr/bin/ld: cannot find -lvtkGUISupportQtOpenGL
/usr/bin/ld: cannot find -lvtkRenderingOpenGL
/usr/bin/ld: cannot find -lvtkRenderingLIC
/usr/bin/ld: cannot find -lvtkgl2ps
/usr/bin/ld: cannot find -lvtkRenderingContextOpenGL
/usr/bin/ld: cannot find -lvtkRenderingVolumeOpenGL
/usr/bin/ld: cannot find -lvtkftgl
/usr/bin/ld: cannot find -lvtkRenderingGL2PS
/usr/bin/ld: cannot find -lvtkGUISupportQtOpenGL
/usr/bin/ld: cannot find -lvtkRenderingOpenGL
/usr/bin/ld: cannot find -lvtkRenderingLIC
/usr/bin/ld: cannot find -lvtkgl2ps
/usr/bin/ld: cannot find -lvtkRenderingContextOpenGL
/usr/bin/ld: cannot find -lvtkRenderingVolumeOpenGL
/usr/bin/ld: cannot find -lvtkftgl
/usr/bin/ld: cannot find -lvtkRenderingGL2PS
collect2: error: ld returned 1 exit status

These were introduced by changes in vtk build #92. The most likely candidate for this error is “Commit 7bb212d348f7e913edc6504b14fbec797ee44f78 FindPythonLibs Py3k fixes from cmake master.” which alters library and include directories for python 3, but we’re using 2.7.
Build #431 introduces another error:

CMake Error at CMakeLists.txt:883 (message):
Generation of form files failed:

Traceback (most recent call last):

File “/usr/local/bin/ffc”, line 213, in <module>
sys.exit(main(sys.argv[1:]))
File “/usr/local/bin/ffc”, line 187, in main
compile_form(ufd.forms, ufd.object_names, prefix, parameters)
File “/usr/local/lib64/python2.7/site-packages/ffc/compiler.py”, line 174, in compile_form
wrapper_code = generate_wrapper_code(analysis, prefix, object_names, parameters)
File “/usr/local/lib64/python2.7/site-packages/ffc/wrappers.py”, line 40, in generate_wrapper_code
return _generate_dolfin_wrapper(analysis, prefix, object_names, parameters)
File “/usr/local/lib64/python2.7/site-packages/ffc/wrappers.py”, line 47, in _generate_dolfin_wrapper
(capsules, common_space) = _encapsulate(prefix, object_names, analysis, parameters)
File “/usr/local/lib64/python2.7/site-packages/ffc/wrappers.py”, line 80, in _encapsulate
(i, form_data) in enumerate(form_datas)]
File “/usr/local/lib64/python2.7/site-packages/ffc/wrappers.py”, line 100, in _encapsule_form
make_classname(prefix, “form”, i),

NameError: global name ‘make_classname’ is not defined

Which was probably introduced by changes in ffc build #65 and persists until build #441, when we return to the missing vtk libraries error.
Build #468 again introduces what appears to be an ffc error:

CMake Error at CMakeLists.txt:883 (message):
Generation of form files failed:

Traceback (most recent call last):

File “/usr/local/bin/ffc”, line 213, in <module>
sys.exit(main(sys.argv[1:]))
File “/usr/local/bin/ffc”, line 187, in main
compile_form(ufd.forms, ufd.object_names, prefix, parameters)
File “/usr/local/lib64/python2.7/site-packages/ffc/compiler.py”, line 159, in compile_form
ir = compute_ir(analysis, prefix, parameters)
File “/usr/local/lib64/python2.7/site-packages/ffc/representation.py”, line 108, in compute_ir
for (i, fd) in enumerate(form_datas)]
File “/usr/local/lib64/python2.7/site-packages/ffc/representation.py”, line 381, in _compute_form_ir
ir[“max_%s_subdomain_id” % integral_type] = form_data.max_subdomain_ids.get(integral_type, 0)

AttributeError: ‘FormData’ object has no attribute ‘max_subdomain_ids’

Which remains in build #472.
To attempt to resolve the link errors for vtk libraries, I have made a number of symlinks removing the version numbers from the library names, but I can’t test this until the most recent ffc error has been resolved.
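
Making the symlinks by hand gets tedious with this many libraries, so it could be scripted. A sketch, assuming the installed files are named like libvtkftgl-6.3.so (the 6.3 comes from the vtk-6.3 include path seen earlier; the exact file naming is my assumption) and that the linker wants the same name without the version suffix. The function name is hypothetical:

```python
import os
import re

# Matches names such as libvtkftgl-6.3.so (assumed naming pattern).
VERSIONED = re.compile(r"^(lib.+)-\d+(?:\.\d+)*\.so$")


def link_unversioned(libdir):
    """Create libfoo.so -> libfoo-X.Y.so symlinks; return the links created."""
    created = []
    for name in sorted(os.listdir(libdir)):
        match = VERSIONED.match(name)
        if not match:
            continue
        link = os.path.join(libdir, match.group(1) + ".so")
        # Never overwrite an existing file or link.
        if not os.path.lexists(link):
            os.symlink(name, link)  # relative link, stays valid if libdir moves
            created.append(link)
    return created
```

Because existing names are never overwritten, rerunning it after a vtk rebuild only adds links for libraries that are new.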
The symlinks resolve the link errors in build #475, but I’m left with some further errors:

/usr/share/tomcat7/.jenkins/workspace/dolfin/build/dolfin/swig/modules/mesh/modulePYTHON_wrap.cxx: In function ‘PyObject* _wrap_CellType_refine_cell(PyObject*, PyObject*)’:
/usr/share/tomcat7/.jenkins/workspace/dolfin/build/dolfin/swig/modules/mesh/modulePYTHON_wrap.cxx:16065:41: error: ‘const class dolfin::CellType’ has no member named ‘refine_cell’
((dolfin::CellType const *)arg1)->refine_cell(*arg2,*arg3,*arg4);
^
/usr/share/tomcat7/.jenkins/workspace/dolfin/build/dolfin/swig/modules/mesh/modulePYTHON_wrap.cxx: In function ‘PyObject* _wrap_MeshGeometry_x__SWIG_0(PyObject*, int, PyObject**)’:
/usr/share/tomcat7/.jenkins/workspace/dolfin/build/dolfin/swig/modules/mesh/modulePYTHON_wrap.cxx:18402:47: error: lvalue required as unary ‘&’ operand
result = (double *) &(arg1)->x(arg2,arg3);
^

I’ll tackle this in another post – this has gotten long enough already.


More on oftests hanging

It seems that updating SymPy hasn’t resolved the hang problem, so I have now added a workaround to skip the hanging test. This should at least get the tests running to completion again and producing results – the skipped test will show up as an erroring test in the results.
Except that this isn’t the only test that runs into this problem:

  • sympref
  • @sym/dsolve

Forcing an error during these tests allows the project build to run to completion.