Hello everyone, this is (basically) my first post. Sorry, but it's a long one. I've developed a cross-platform C app for low-latency desktop streaming via H.264, and I'm trying to bring it to the Pi. I'm using the OpenMAX IL and ilclient (<--thanks for this!) libraries provided with Raspbian. Most of what I've done is based on the /opt/vc/src/hello_pi/hello_video/video.c file found on Raspbian.
My server program sends a stream of encoded H.264 frames at precisely 60 FPS over TCP to the client. In my Windows and Mac OS versions of the client, I use platform-specific hardware decoder libraries, then draw the raw frames with the SDL2 library (which uses OpenGL under the hood).
Based on the video.c example mentioned above, I have a working version on the Pi, but not without some issues. FYI, I have removed the "clock" and "video_scheduler" components, as they seemed to cause choppier video compared to simply tunneling the "video_decode" component's output port directly to the "video_render" input port. All testing was done on a Pi 3 with the current unmodified version (2016-05-10) of Raspbian Jessie.
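For context, the direct tunnel I describe (with the clock and scheduler components removed) is set up roughly like this with ilclient. The port numbers come from hello_video's video.c (130/131 for video_decode in/out, 90 for video_render in); this is a trimmed, untested sketch of my setup, not a complete program:

```c
/* Sketch based on hello_pi/hello_video/video.c: tunnel video_decode's
 * output port (131) straight into video_render's input port (90),
 * with no clock or video_scheduler component in between. */
COMPONENT_T *video_decode = NULL, *video_render = NULL;
TUNNEL_T tunnel[2];
memset(tunnel, 0, sizeof(tunnel));

ilclient_create_component(client, &video_decode, "video_decode",
                          ILCLIENT_DISABLE_ALL_PORTS |
                          ILCLIENT_ENABLE_INPUT_BUFFERS);
ilclient_create_component(client, &video_render, "video_render",
                          ILCLIENT_DISABLE_ALL_PORTS);

set_tunnel(tunnel, video_decode, 131, video_render, 90);

/* Once the port-settings-changed event fires on port 131: */
if (ilclient_setup_tunnel(tunnel, 0, 0) != 0) {
    /* handle error */
}
ilclient_change_component_state(video_render, OMX_StateExecuting);
```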
My questions / issues are as follows:
1. The main issue: if I set the H.264 quality low, say a QP of 36 for 1080p@60, the Pi handles the incoming data smoothly; the average bitrate in this scenario is about 2-3 Mbps. If I raise the quality to around QP 30, things start breaking. The bitrate increases to about 5-6 Mbps, with occasional spikes of around 10 Mbps. The strange thing is that it is not the ilclient or OMX libraries returning errors here--I am actually losing data over the TCP connection. The video is sporadically choppy, with occasional, noticeable delays between certain frames. Eventually, when reading the size header I send with every frame, the value is clearly incorrect (e.g. negative), indicating lost or misaligned data, and the program exits. This may be more of a networking issue than an OpenMAX issue, but I know the Pi can sustain much higher bandwidth than 10 Mbps out of the box (I am using the built-in wired connection). I have tinkered with almost every sysctl net.core and net.ipv4.tcp* setting you can imagine, to no avail. My only theory is that some bottleneck on a shared bus is affecting the TCP reads. CPU usage during the issue reads a low 15-25% (of a single core). Overclocking the Pi with gpu_freq, force_turbo, and over_voltage did not seem to help either. The TCP reads occur in a separate thread and are not blocked by any other processing. My voltage is stable and I am not getting any voltage or temperature warnings. Any ideas here?

UPDATE: I tested the program without any OpenMAX decoding or rendering, and the issue remains. So it must be a networking bottleneck of some kind...
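One thing worth ruling out: TCP itself retransmits lost packets, so a bad size header usually means the reader has fallen out of sync with the framing, most often by treating a short read() as a full one. My receive path loops until the full header and payload have arrived, along these lines (a minimal sketch; the 4-byte big-endian length header is just my wire format, substitute your own):

```c
/* Sketch: read exactly n bytes from a stream socket, handling the
 * partial reads that read()/recv() can legally return on TCP.  A short
 * read that is treated as a full one is the classic cause of a garbage
 * size header: the bytes are not lost, the reader is just misaligned. */
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>  /* ntohl/htonl */

/* Returns 0 on success, -1 on EOF or error. */
static int read_exact(int fd, void *buf, size_t n)
{
    uint8_t *p = buf;
    while (n > 0) {
        ssize_t r = read(fd, p, n);
        if (r < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal, retry */
            return -1;          /* real error */
        }
        if (r == 0)
            return -1;          /* peer closed the connection */
        p += r;
        n -= (size_t)r;
    }
    return 0;
}

/* Read one length-prefixed frame; returns payload size or -1. */
static ssize_t read_frame(int fd, uint8_t *out, size_t cap)
{
    uint32_t hdr;
    if (read_exact(fd, &hdr, sizeof hdr) != 0)
        return -1;
    uint32_t len = ntohl(hdr);
    if (len > cap)
        return -1;              /* frame larger than our buffer */
    if (read_exact(fd, out, len) != 0)
        return -1;
    return (ssize_t)len;
}
```

If the stream still desynchronizes with a loop like this, then the corruption is happening elsewhere (sender side, or a genuine stall).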
2. Will the "video_render" OMX component always be the best choice for presenting the frames, or would it be possible to use SDL2 with OpenGL to achieve similar performance? If SDL2 is an option, what is the default raw pixel format of the frames that come out of the "video_decode" component, and can I change that pixel format?
3. Is there a way to turn vsync off for the "video_render" component?
4. Is there a way to increase the buffer size obtained through ilclient_get_input_buffer? The buffers seem to default to 80KB.
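I don't know of an ilclient call for this, but in raw OMX IL the usual route is to rewrite the port definition on video_decode's input port (130) before the port is enabled. Something like the following untested sketch; whether the component actually honors a larger nBufferSize would need to be confirmed with a GetParameter read-back:

```c
/* Sketch: request larger input buffers from video_decode (input port
 * 130) by editing the port definition before the port is enabled.
 * Verify the result by reading the definition back afterwards. */
OMX_PARAM_PORTDEFINITIONTYPE portdef;
memset(&portdef, 0, sizeof(portdef));
portdef.nSize = sizeof(portdef);
portdef.nVersion.nVersion = OMX_VERSION;
portdef.nPortIndex = 130;

OMX_GetParameter(ILC_GET_HANDLE(video_decode),
                 OMX_IndexParamPortDefinition, &portdef);
portdef.nBufferSize = 256 * 1024;   /* e.g. 256 KB instead of ~80 KB */
OMX_SetParameter(ILC_GET_HANDLE(video_decode),
                 OMX_IndexParamPortDefinition, &portdef);
```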
5. I'm interested in setting the data unit to OMX_DataUnitCodedPicture to potentially reduce latency, since I read frames from the network one at a time, but I'm not sure how to set it via OMX_IndexParamBrcmDataUnit. Any examples out there?
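From the headers, the structure for that index appears to be the standard OMX_PARAM_DATAUNITTYPE. I haven't verified this on the Pi, but I'd expect the call to look roughly like this (untested sketch; port 130 is video_decode's input port):

```c
/* Sketch: tell video_decode that each input buffer carries exactly one
 * coded picture (one frame), via the Broadcom OMX_IndexParamBrcmDataUnit
 * index and the standard OMX_PARAM_DATAUNITTYPE struct. */
OMX_PARAM_DATAUNITTYPE unit;
memset(&unit, 0, sizeof(unit));
unit.nSize = sizeof(unit);
unit.nVersion.nVersion = OMX_VERSION;
unit.nPortIndex = 130;
unit.eUnit = OMX_DataUnitCodedPicture;
unit.eEncapsulation = OMX_DataEncapsulationElementaryStream;

OMX_SetParameter(ILC_GET_HANDLE(video_decode),
                 OMX_IndexParamBrcmDataUnit, &unit);
```

Whether this actually reduces latency would presumably also depend on marking buffers with OMX_BUFFERFLAG_ENDOFFRAME at each frame boundary, which I already do one frame per read.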
6. Any other tips for reducing potential buffering / latency in the decoding & rendering process?
Thanks in advance!