Ruuzis
Posts: 49
Joined: Wed May 21, 2014 11:35 am

video_decoder video_render tunnel dynamic reconfiguration

Tue Jul 08, 2014 2:12 pm

Hi,

Maybe someone has had a similar issue and can point out what I'm doing wrong - I'm playing an RTP H.264 stream whose resolution changes. I have "video_decode" and "video_render" components instantiated (I did not use the ilclient.h library; I made my own C++ objects that call the OMX core functions, since having an event callback contained in each component class seemed more informative to me). I parse the stream, get the first port settings changed event, set up the tunnel, and the video is rendered successfully. The issue is that when the next resolution change happens, no port settings changed event is signalled.

So what I am doing now:
* Create decoder and renderer, both in executing state; the decoder input port is enabled with buffers allocated, all other ports on the decoder/renderer are disabled
* Read an RTP packet, extract the NAL units, parse the SPS if one is found among them
* Feed the NAL units to the decoder
* If a port settings changed event is signaled, I call OMX_SetupTunnel on the decoder's output and the renderer's input ports and then enable them; the first time this works like a charm
* Video is rendered
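
For the "parse the SPS if one is found" step, a minimal C++ sketch of classifying a NAL unit by its header byte might look like this. It assumes each NAL unit has already been extracted from the RTP payload and carries no start code; the function names are illustrative and not taken from the poster's code.

```cpp
#include <cstddef>
#include <cstdint>

// The first byte of an H.264 NAL unit is laid out as:
// forbidden_zero_bit (1 bit), nal_ref_idc (2 bits), nal_unit_type (5 bits).
// Returns the nal_unit_type, or -1 for an empty buffer.
inline int NalUnitType(const std::uint8_t* nal, std::size_t size) {
    return size > 0 ? (nal[0] & 0x1F) : -1;
}

// nal_unit_type 7 is a sequence parameter set (SPS).
inline bool IsSps(const std::uint8_t* nal, std::size_t size) {
    return NalUnitType(nal, size) == 7;
}
```

A header byte of 0x67 gives type 7 (SPS), while 0x68 gives type 8 (PPS).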

Issues:
* When a new SPS with a different resolution is encountered, what should I do? I wanted to tear down the tunnel at that point and wait for the next port settings changed event to set it up again. I tried flushing all the ports, disabling them and then setting the tunnels on both of them to NULL (OMX_SetupTunnel(component, port, 0, 0)), but if I do this, the port settings changed event does not arrive when the new SPS is fed to the decoder
* I could skip the action above and just continue feeding NALs to the decoder; then I get the next port settings changed event, but what should I do then to reconnect the tunnel? Flush and/or disable the ports and/or do the OMX_SetupTunnel(...,...,0,0) and then call OMX_SetupTunnel again? Or is there an easier way?

Also if anyone has general knowledge of how a tunneled port connection should be dynamically reconfigured that could be very helpful.

6by9
Raspberry Pi Engineer & Forum Moderator
Posts: 7304
Joined: Wed Dec 04, 2013 11:27 am
Location: ZZ9 Plural Z Alpha, aka just outside Cambridge.

Re: video_decoder video_render tunnel dynamic reconfiguratio

Tue Jul 08, 2014 3:02 pm

I'm a little rusty on the details of video_decode, but I can try to help based on how I believe it should work (I can't guarantee that it actually does work!).

First off, you can't dynamically reconfigure a port. OMX_IndexParamPortDefinition - the "Param" in the name means that the port must be disabled, or the component in the loaded state, before that parameter can be set.

You shouldn't need to parse the bitstream at all - that is video_decode's job, and the parsing obviously varies depending on the codec.
On receiving the new SPS video_decode should raise a port settings changed event and stop producing buffers. At that point you should disable and clear down the tunnel to video_render, and then create it afresh. Video_decode should then resume decoding.
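
The order described above can be sketched as follows. This is only an illustration under stated assumptions: StepLog and the port names are hypothetical stand-ins that record the steps, and a real client would replace each call with OMX_SendCommand (OMX_CommandPortDisable / OMX_CommandPortEnable) or OMX_SetupTunnel. Sending both disable commands before waiting on either is also an assumption, chosen so that neither port stalls waiting for its tunnel peer.

```cpp
#include <string>
#include <vector>

// Hypothetical recorder standing in for the real OMX calls; it only logs the order.
struct StepLog {
    std::vector<std::string> steps;
    void DisablePort(const std::string& port) { steps.push_back("disable " + port); }
    void WaitDisabled(const std::string& port) { steps.push_back("wait " + port); }
    void ClearTunnel(const std::string& port) { steps.push_back("clear " + port); }
    void SetupTunnel(const std::string& out, const std::string& in) {
        steps.push_back("tunnel " + out + "->" + in);
    }
    void EnablePort(const std::string& port) { steps.push_back("enable " + port); }
};

// Runs the reconfiguration sequence on any object that offers the five steps above.
template <typename Ops>
void ReconfigureTunnel(Ops& ops, bool tunnelActive) {
    const std::string out = "decoder.out";
    const std::string in = "renderer.in";
    if (tunnelActive) {
        // Request both disables first, then wait for both to complete.
        ops.DisablePort(out);
        ops.DisablePort(in);
        ops.WaitDisabled(out);
        ops.WaitDisabled(in);
        // Clear the old tunnel on each end.
        ops.ClearTunnel(out);
        ops.ClearTunnel(in);
    }
    // Create the tunnel afresh and re-enable both ends.
    ops.SetupTunnel(out, in);
    ops.EnablePort(out);
    ops.EnablePort(in);
}
```

On the very first port settings changed event tunnelActive is false, so only the setup and the two enables run.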

There may be a niggle in that internally video_decode allocates a set of images that the hardware decodes into as an internal format. If those image buffers have been sized based on the initial resolution and the new SPS signals a higher res, then we have an issue. There is OMX_IndexParamImagePoolSize that allows the client to specify the absolute maximum resolution that can be decoded (ie the size of those internal buffers). It appears to default to 1080P, so I'm not sure there is an issue here, but I'm also not the expert on video decode! All I can suggest is try it, and if you hit problems then try tweaking OMX_IndexParamImagePoolSize.
Software Engineer at Raspberry Pi Trading. Views expressed are still personal views.
I'm not interested in doing contracts for bespoke functionality - please don't ask.

Ruuzis
Posts: 49
Joined: Wed May 21, 2014 11:35 am

Re: video_decoder video_render tunnel dynamic reconfiguratio

Tue Jul 08, 2014 3:32 pm

Hi 6by9,

Thanks a lot for the response. Well, I'm not going higher than Full HD at this point, so I should not exceed the limit, but I'll check that out. Thanks for the clarifications - my intention in bringing down the tunnel earlier was only to reduce latency, since this is real-time video. From what I see, the port settings changed event is issued only after a couple of the new frames have been decoded, so I thought that if I had already brought down the tunnel when I saw the new SPS at RTP level, I could save time compared with bringing it down only after the settings have changed. I'll try re-enabling it on the fly. Do I need to flush the ports, or just clear and re-enable the tunnel?

Also, from what I described, do you have any explanation for why the port settings changed event was no longer signaled after I brought down the tunnel? The initial event arrives while the component is executing and the port is disabled, and all my SPS parsing code did was flush, disable the ports and set the tunnel to 0, so I could not see any reason why the event was not signaled any more.

6by9
Raspberry Pi Engineer & Forum Moderator
Posts: 7304
Joined: Wed Dec 04, 2013 11:27 am
Location: ZZ9 Plural Z Alpha, aka just outside Cambridge.

Re: video_decoder video_render tunnel dynamic reconfiguratio

Tue Jul 08, 2014 5:17 pm

You can't go above Full HD. I was only flagging it as a potential issue as I had a recollection that it sized the pool to the size found in the headers rather than always 1080P. Always assuming 1080P wastes a lot of memory. Code appears to disagree with me, but I haven't had a chance to check.

Your code may have seen the SPS, but has the decoder necessarily processed enough of its input buffers to have seen the SPS too? There's a FIFO on the input. If you disable the output port before the decoder has processed all the input buffers that precede the SPS, then it will sit there waiting to clear them first. The first port settings changed event should occur after the first frame has been decoded.

BTW I assume your new SPS does actually change the resolution. If you insert an extra SPS selecting the same resolution, then it won't trigger a new port settings changed event.

Ruuzis
Posts: 49
Joined: Wed May 21, 2014 11:35 am

Re: video_decoder video_render tunnel dynamic reconfiguratio

Tue Jul 08, 2014 5:37 pm

Yeah, I saw a post in another thread that stated that all the input is rescaled to the native resolution of the display so I assume that everything will be rescaled to 1080P in the video_render, which is nice since the displays handle non native resolutions poorly.

About the SPS - well yeah, I am parsing all the SPS units I get and triggering the event only when the new SPS has a different resolution from the current one, and of course I am also pushing all the NAL units to the input of the decoder. The thing about the buffers makes total sense: when I tried to tear down the tunnel at this point without disabling it, the call returned OMX_ErrorPortUnpopulated, which could mean that buffers are still pending. But doesn't the flush command clear all pending buffers, or am I getting something wrong?

Ruuzis
Posts: 49
Joined: Wed May 21, 2014 11:35 am

Re: video_decoder video_render tunnel dynamic reconfiguratio

Tue Jul 08, 2014 10:15 pm

I tried removing all the SPS detection magic and stuck with this:

	// Handler for the port settings changed event; 131 is the decoder output port here.
	void OnPortSettingsChanged(Omx::CComponent* pComponent, unsigned int nPortId)
	{
		if (pComponent == &m_VideoDecoder && nPortId == 131)
		{
			if (m_bTunnelActive)
			{
				// A tunnel already exists: disable both ends without waiting
				// for completion (false), then clear the tunnel on each port.
				m_VideoDecoder.GetOutputPort().Disable(false);
				m_VideoRenderer.GetInputPort().Disable(false);
				Omx::CTunnel(&m_VideoDecoder.GetOutputPort(), NULL);
				Omx::CTunnel(&m_VideoRenderer.GetInputPort(), NULL);
			}
			// (Re)create the tunnel and enable both ends.
			Omx::CTunnel(&m_VideoDecoder.GetOutputPort(), &m_VideoRenderer.GetInputPort());
			m_VideoDecoder.GetOutputPort().Enable();
			m_VideoRenderer.GetInputPort().Enable();
			m_bTunnelActive = true;
		}
	}
There are no errors, but the rendering does not resume. So basically, if the tunnel was active I disable the ports, reset the tunnels and create a new one. Should I do something more? The false parameter means "do not wait for the completion event"; otherwise the call blocks.

6by9
Raspberry Pi Engineer & Forum Moderator
Posts: 7304
Joined: Wed Dec 04, 2013 11:27 am
Location: ZZ9 Plural Z Alpha, aka just outside Cambridge.

Re: video_decoder video_render tunnel dynamic reconfiguratio

Wed Jul 09, 2014 10:49 am

Flush requests that all buffers are returned to the buffer supplier (read the OMX spec, normally the output port unless overridden) as soon as possible.

Your code looks reasonable to me. Any chance you could throw a binary and a couple of instructions onto github or similar so I can give it a quick test and see what state the GPU is in?
Also have you tried sudo vcdbg log msg to see if there are any useful messages logged by the GPU? If you do sudo vcgencmd set_logging level=0xc0 first, then you should get all the logging from the IL components that's going.

Ruuzis
Posts: 49
Joined: Wed May 21, 2014 11:35 am

Re: video_decoder video_render tunnel dynamic reconfiguratio

Wed Jul 09, 2014 11:19 am

Hi,

thanks for the response. OK, so I guess the flush should save me from the OMX_ErrorPortUnpopulated error.

I'll try that. I just need to dump the RTP stream to an H.264 file with start codes, but that should not be an issue, and then I'll upload my source somewhere too; maybe all the C++ stuff actually helps someone out - ilclient.h is great for fast development, but if you need more control over what is happening I guess you still need some code of your own. I'll post here once I have uploaded everything, and thanks for the help. I mean, GStreamer is able to handle this (although I'm not 100% sure that it isn't tearing down the decoder), so it must be something in my call sequence.

Ruuzis
Posts: 49
Joined: Wed May 21, 2014 11:35 am

Re: video_decoder video_render tunnel dynamic reconfiguratio

Wed Jul 09, 2014 1:04 pm

Hi 6by9,

Sorry for the number of questions, but since the Raspberry Pi seems to be the most suitable thing for the project I'm working on, I have no other choice than to go through this :D I considered some alternatives to reconfiguring the tunnel:

* Tearing down the renderer and instantiating a new one on format changes, but I guess that would waste time...
* Allocating the buffers myself and calling EmptyThisBuffer on the renderer and FillThisBuffer on the decoder myself - since you know the internals, I wanted to ask what the performance loss is here. If a tunnel is used, do the two components somehow use shared memory or something so that the decoding happens in place, or do they still simply allocate buffers and call these 2 commands? If there are no performance penalties from excessive memory copying, I could go that way and then update the port definitions and supply the buffers myself.

6by9
Raspberry Pi Engineer & Forum Moderator
Posts: 7304
Joined: Wed Dec 04, 2013 11:27 am
Location: ZZ9 Plural Z Alpha, aka just outside Cambridge.

Re: video_decoder video_render tunnel dynamic reconfiguratio

Wed Jul 09, 2014 1:38 pm

Ruuzis wrote:Tearing down the renderer and instantiating a new one on format changes, but I guess that would waste time...
No point. The render component gets reset by disabling and reenabling the input port.
Ruuzis wrote:Allocating the buffers myself and calling EmptyThisBuffer on the renderer and FillThisBuffer on the decoder myself - since you know the internals, I wanted to ask what the performance loss is here. If a tunnel is used, do the two components somehow use shared memory or something so that the decoding happens in place, or do they still simply allocate buffers and call these 2 commands? If there are no performance penalties from excessive memory copying, I could go that way and then update the port definitions and supply the buffers myself.
Allocating the buffers yourself won't help. The buffer has to match the port definition otherwise the renderer doesn't know how to interpret the data. You can't change the port definition on an active port, so you'd still have to disable ports, GetParameter(PortDefinition) on decode output port, SetParameter(PortDefinition) on the render input port, and then re-enable the ports. The tunnel is doing most of that for you.
There is also a significant performance gain by using tunnels. The two components do support a proprietary tunneling mechanism that passes a reference to the frame rather than the actual frame, and that is negotiated when the tunnel is set up. The decoder hardware block doesn't produce standard YUV420PackedPlanar, so if you allocate the buffers yourself, then we end up having to do an image conversion to YUV420PackedPlanar.
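
To give a feel for what a per-frame copy would involve without the tunnel, here is a rough C++ calculation of the size of one planar YUV 4:2:0 frame (12 bits per pixel). The stride alignment of 32 and the height alignment of 16 are assumptions made for illustration, not values confirmed in this thread; the nStride and nSliceHeight reported by the port definition are the values to trust.

```cpp
#include <cstddef>

// Round value up to the next multiple of alignment (alignment must be non-zero).
constexpr std::size_t AlignUp(std::size_t value, std::size_t alignment) {
    return (value + alignment - 1) / alignment * alignment;
}

// Bytes in one planar YUV 4:2:0 frame: a full-size luma plane plus two
// quarter-size chroma planes, i.e. 1.5 bytes per padded pixel.
// ASSUMPTION: stride padded to 32 pixels, height padded to 16 rows.
constexpr std::size_t Yuv420FrameBytes(std::size_t width, std::size_t height) {
    return AlignUp(width, 32) * AlignUp(height, 16) * 3 / 2;
}
```

Under those assumptions a 1920x1080 frame occupies 1920 * 1088 * 3 / 2 = 3133440 bytes, so at, say, 25 frames per second a client-side buffer path would move roughly 78 MB every second, in addition to the format conversion mentioned above.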

Video decode should pick up again when the output port has been re-enabled with the correct port definition. That's the thing that I'd like to see checked, and if there is a GPU bug then we'll fix the firmware.

Ruuzis
Posts: 49
Joined: Wed May 21, 2014 11:35 am

Re: video_decoder video_render tunnel dynamic reconfiguratio

Thu Jul 10, 2014 8:33 am

Hi 6by9,

OK, I pushed an example (sources/binary/.h264 input file) to https://github.com/rubu/RpiOmxTunnelReconfiguration. Basically, just take RpiOmxTunnelReconfiguration.bin, place sample.h264 in the same directory and run it. It outputs some debug information, reads the file and feeds it to the decoder; when the file ends, it waits for ENTER to exit. If you can see anything that gives me some hints, that would be great. I'm developing using Visual Studio and VisualGDB, but I think that the makefile in the git repository should also work with make if you are interested in rebuilding.

When I ran it about 10 times yesterday, once or twice I got a bad state error, so I wanted to check - is enabling/disabling component ports in a callback safe?

Ruuzis
Posts: 49
Joined: Wed May 21, 2014 11:35 am

Re: video_decoder video_render tunnel dynamic reconfiguratio

Fri Jul 11, 2014 1:01 am

Hi 6by9,

I did a lot of debugging and tried to tear down the tunnel asynchronously (without waiting on the port disabled event on the renderer), and it seems like the video renderer port is not disabled, because I get a new port settings changed event while waiting on the video render input port being disabled. Is that normal?

Ruuzis
Posts: 49
Joined: Wed May 21, 2014 11:35 am

Re: video_decoder video_render tunnel dynamic reconfiguratio

Fri Jul 11, 2014 8:44 am

Hi 6by9,

Did you have any spare time to look at the binary sample? :)

Ruuzis
Posts: 49
Joined: Wed May 21, 2014 11:35 am

Re: video_decoder video_render tunnel dynamic reconfiguratio

Mon Jul 14, 2014 9:28 pm

Hi,

thanks, the issue really was that I was calling the disable/flush commands from the callback thread, so my application's logic was to blame. I moved the blocking event handlers out into a separate thread, and now it works like a charm. I left everything in git in case anyone else finds it useful, and thanks for the help 6by9, much appreciated.

dom
Raspberry Pi Engineer & Forum Moderator
Posts: 5331
Joined: Wed Aug 17, 2011 7:41 pm
Location: Cambridge

Re: video_decoder video_render tunnel dynamic reconfiguratio

Mon Jul 14, 2014 11:16 pm

Ruuzis wrote:thanks, the issue really was that I was calling the disable/flush commands from the callback thread, so my application's logic was to blame. I moved the blocking event handlers out into a separate thread, and now it works like a charm. I left everything in git in case anyone else finds it useful, and thanks for the help 6by9, much appreciated.
Yes, callbacks should complete quickly (typically just signal another thread) as they will block vchiq (the message passing interface to the GPU).
Making further OpenMAX calls (which use vchiq) from inside a callback has a high chance of deadlocking.
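
The advice above (keep callbacks short and hand the work to another thread) can be sketched in standard C++ as follows. EventWorker is a hypothetical helper, not part of ilclient or of the poster's wrapper; the task posted to it is where blocking port disable/enable calls would go.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Runs posted tasks on its own thread so that an event callback can return immediately.
class EventWorker {
public:
    EventWorker() : m_thread([this] { Run(); }) {}

    // Drains any queued tasks, then stops and joins the worker thread.
    ~EventWorker() {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_stop = true;
        }
        m_cv.notify_one();
        m_thread.join();
    }

    // Safe to call from a callback: it only enqueues the task and signals the worker.
    void Post(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_tasks.push(std::move(task));
        }
        m_cv.notify_one();
    }

private:
    void Run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(m_mutex);
                m_cv.wait(lock, [this] { return m_stop || !m_tasks.empty(); });
                if (m_tasks.empty()) return;  // stop requested and queue drained
                task = std::move(m_tasks.front());
                m_tasks.pop();
            }
            task();  // executed outside the lock, on the worker thread
        }
    }

    std::mutex m_mutex;
    std::condition_variable m_cv;
    std::queue<std::function<void()>> m_tasks;
    bool m_stop = false;
    std::thread m_thread;  // declared last so the members above exist before it starts
};
```

A port settings changed callback would then do nothing more than call Post with a small lambda, and the tunnel reconfiguration itself would run inside that lambda on the worker thread.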

Ruuzis
Posts: 49
Joined: Wed May 21, 2014 11:35 am

Re: video_decoder video_render tunnel dynamic reconfiguratio

Tue Jul 15, 2014 2:41 pm

Thank you both for the help :) My issues are solved and now I can decode a dynamic RTP stream perfectly; I hope that the C++ wrapper around OMX helps someone else too. And many thanks for providing an auto-reset event in the VCOS API - I am coming from Windows and am used to WaitForSingleObject/WaitForMultipleObjects, and VCOS_EVENT_T and VCOS_EVENT_FLAGS_T are great.
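
For readers without a Windows background, the auto-reset behaviour mentioned here (one Set releases one waiter, after which the event clears itself) can be sketched in standard C++. This is an analogue for illustration only; it is not the VCOS implementation and does not use the VCOS API.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

// Event that returns to the non-signaled state as soon as one wait succeeds.
class AutoResetEvent {
public:
    // Signal the event and wake one waiter, if any.
    void Set() {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_signaled = true;
        }
        m_cv.notify_one();
    }

    // Block until the event is signaled, then consume the signal.
    void Wait() {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_cv.wait(lock, [this] { return m_signaled; });
        m_signaled = false;
    }

    // As Wait(), but give up after the timeout; returns false if it expired.
    bool WaitFor(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(m_mutex);
        if (!m_cv.wait_for(lock, timeout, [this] { return m_signaled; })) {
            return false;
        }
        m_signaled = false;
        return true;
    }

private:
    std::mutex m_mutex;
    std::condition_variable m_cv;
    bool m_signaled = false;
};
```

An event callback would call Set(), and the thread that performs the blocking port commands would sit in Wait() or WaitFor().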

6by9
Raspberry Pi Engineer & Forum Moderator
Posts: 7304
Joined: Wed Dec 04, 2013 11:27 am
Location: ZZ9 Plural Z Alpha, aka just outside Cambridge.

Re: video_decoder video_render tunnel dynamic reconfiguratio

Mon Jul 21, 2014 5:06 pm

That'll teach me for not keeping this thread open in a tab - I was just starting to investigate it!

Glad you've got things solved. Interestingly that H264 stream you put on github throws up a myriad of errors when I pass it through our bitstream analyser. Things like:
  • SEI message too short (or stated payload size too long). payloadSize was 7, but message content was only 6 bytes long.
  • Invalid VLC for total_zeros in Table 9-7. TotalCoeff=1 and bit pattern='000000000' at position 0xc007 (dec. 49159), bit 5.
  • Failed to find NumCoeff/TrailingOnes (frame 18)
  • Expected NAL startcode after macroblock
  • rbsp_alignment_zero_bit = 1 (0x1), at position 0xc00d (dec. 49165), bit 1. Only allowable value is 0 (0x0).
  • The active SPS has been replaced, but the contents of the new SPS are not identical. The value of pic_width_in_mbs_minus1 differs (old value 39, new value 49)
And the analyser then actually falls over! VLC also crashes quite nicely on it. Makes me feel quite proud of our codecs people for writing code/hardware that can cope with such a bad stream.
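
The last message in that list is simply the resolution change itself. As a worked example, the coded frame size follows from the SPS fields like this (a sketch based on the H.264 SPS semantics; it ignores the frame cropping fields, and the function names are illustrative):

```cpp
#include <cstdint>

// Coded width in pixels: macroblocks are 16 pixels wide.
constexpr std::uint32_t CodedWidth(std::uint32_t picWidthInMbsMinus1) {
    return (picWidthInMbsMinus1 + 1) * 16;
}

// Coded height in pixels: a map unit is one macroblock row (16 lines) when
// frame_mbs_only_flag is set, and a pair of rows (32 lines) otherwise.
constexpr std::uint32_t CodedHeight(std::uint32_t picHeightInMapUnitsMinus1, bool frameMbsOnly) {
    return (frameMbsOnly ? 1u : 2u) * (picHeightInMapUnitsMinus1 + 1) * 16;
}
```

With the values from the analyser output, CodedWidth(39) is 640 and CodedWidth(49) is 800, so the stream switches from a 640-pixel-wide to an 800-pixel-wide picture at that point.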

Ruuzis
Posts: 49
Joined: Wed May 21, 2014 11:35 am

Re: video_decoder video_render tunnel dynamic reconfiguratio

Thu Jul 24, 2014 1:52 pm

Hi 6by9,

Well, the main thing is I got a solution in the end :) Btw, if you are reading this - I made a post about the NALU formats supported by the decoder (http://www.raspberrypi.org/forums/viewt ... 38&t=81883). Do you know if the single-NAL-per-buffer and start code modes are a limitation of the hardware or of the software? It is not a huge deal, but if it is a software thing only, then having the two-byte interleaved length mode would be great for RTP; that is just a suggestion, though.

As for the stream, well yeah, I was only able to play it with ffplay; VLC crashed for me too. The stream is produced by an Intel hardware encoder and I don't mess with it, so... even the mighty fail :D Some of the errors could come from the fact that I have disabled AUD units; maybe the analyzer is actually expecting them. The error about the SPS ("The active SPS has been replaced, but the contents of the new SPS are not identical. The value of pic_width_in_mbs_minus1 differs (old value 39, new value 49)") made me curious - I guess that is what should actually happen when the resolution changes: a new SPS with a new width and height comes in, so it is weird that the analyzer complains. Is this analyzer an open source product? I was interested to see whether errors occur in a fragment without resolution changes, because now that I have advanced further I see artifacts at high resolutions (1152x768 works fine, but 1920x1080 gets corrupted), but this coincides with packet loss, so it may be caused only by that. Sorry for having so many questions in one post, but do you know if the Raspberry Pi network interface has any throughput limitations? Streaming to a PC where the stream is decoded by ffmpeg works OK. I am using multicast currently, but will try unicast to see if that makes things better.

6by9
Raspberry Pi Engineer & Forum Moderator
Posts: 7304
Joined: Wed Dec 04, 2013 11:27 am
Location: ZZ9 Plural Z Alpha, aka just outside Cambridge.

Re: video_decoder video_render tunnel dynamic reconfiguratio

Thu Jul 24, 2014 3:14 pm

So many questions :D

I'm not the expert on the decoder, but can give some pointers that may be helpful.

No support for OMX_NaluFormatTwoByteInterleavedLength. It looks like there is a processing step before presenting the data to the actual codec hardware that could allow us to break up the buffer into multiple jobs, but I'd want to check with a colleague first (he may even be interested in implementing it if you're lucky).
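
In the meantime a client can do the repackaging itself. The following C++ sketch turns a buffer of NAL units that are each prefixed with a two-byte length into a start code (Annex B) stream. It assumes big-endian lengths, as used for the NAL unit sizes in RTP aggregation packets; the function name is hypothetical and this is not part of the OMX API.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Appends each length-prefixed NAL unit from `in` to `out`, preceded by a
// four-byte start code. Returns false if the input is truncated or contains
// a zero-length unit; `out` may then hold the units converted so far.
bool LengthPrefixedToAnnexB(const std::vector<std::uint8_t>& in,
                            std::vector<std::uint8_t>& out) {
    static const std::uint8_t kStartCode[] = {0x00, 0x00, 0x00, 0x01};
    std::size_t pos = 0;
    while (pos < in.size()) {
        if (in.size() - pos < 2) return false;  // not enough bytes for a length field
        // ASSUMPTION: the 16-bit length is stored big-endian (network byte order).
        const std::size_t len = (static_cast<std::size_t>(in[pos]) << 8) | in[pos + 1];
        pos += 2;
        if (len == 0 || in.size() - pos < len) return false;
        const auto first = in.begin() + static_cast<std::ptrdiff_t>(pos);
        out.insert(out.end(), kStartCode, kStartCode + 4);
        out.insert(out.end(), first, first + static_cast<std::ptrdiff_t>(len));
        pos += len;
    }
    return true;
}
```

The output can then be fed to the decoder in the start code mode that the post above says is supported.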

The analyser was Tektronix MTS4EA, so definitely not open source (I dread to think how much it costs!). It throws up lots of other slightly spurious errors too on normal clips due to bitrate being slightly over, or too many thingy widgets in a wowser, so the output does have to be taken with a pinch of salt. It really is comparing against the letter of the spec, not what is generally supported.

1920x1080 giving corruption after a packet loss - if you've lost a packet then game over until the next I-frame. The LAN interface is actually on the USB bus on the Pi. There were some issues with USB, but most of those were resolved about a year ago (rough guess), so it should be fairly reliable now. Whether you can saturate the LAN is another question though (was your PC on Gigabit ether vs the Pi on 100MBit/s?). I know I can push a good 30Mbit/s over the USB (not the LAN - I'm running a PVR recording from DVB-S2 and streaming back out to a NAS) without too many worries, but receiving multicast may have different loadings.

Ruuzis
Posts: 49
Joined: Wed May 21, 2014 11:35 am

Re: video_decoder video_render tunnel dynamic reconfiguratio

Thu Aug 28, 2014 5:20 pm

I wanted to say thanks to 6by9 for all the help, here is a little demo of a 16 * 720p raspberry pi driven video wall:

https://www.youtube.com/watch?v=eFAATNo ... GWc3rPcwaA

So far this concept works great, looking forward to making 16 * 1080 and 25 * 1080 implementations.

Btw, as for my post on the packet loss, I increased net.core.wmem_default to 4MB and no packets were lost any more:)

dom
Raspberry Pi Engineer & Forum Moderator
Posts: 5331
Joined: Wed Aug 17, 2011 7:41 pm
Location: Cambridge

Re: video_decoder video_render tunnel dynamic reconfiguratio

Sun Aug 31, 2014 10:37 am

Ruuzis wrote:I wanted to say thanks to 6by9 for all the help, here is a little demo of a 16 * 720p raspberry pi driven video wall:
Nice. Is this 16 separate H.264 streams, each being decoded full size on the Pis, or one common H.264 stream with each Pi cropping the image to a different 1/16 section?

Ruuzis
Posts: 49
Joined: Wed May 21, 2014 11:35 am

Re: video_decoder video_render tunnel dynamic reconfiguratio

Mon Sep 01, 2014 6:59 pm

First scenario, the OS is being "told" that it has 16 monitors, each of which is a 22" LCD + Raspberry Pi.
