I2S peripheral info?

Post by miceopede »

I am working on a realtime audio project with the C2 and looked at the I2S driver.

Is there any way to request more technical information from Amlogic? Will the datasheet be updated, or can we generally expect its quality to improve?

1) I am looking into adding mmap support to the I2S kernel driver, but the peripheral appears to read memory in a format that is not standard interleaved (LRLRLRLR...). Instead it reads chunks of LLLLLLLLRRRRRRRR... Is there some way to configure the peripheral so it can be fed standard interleaved stereo samples?

2) The I2S driver does not use buffer thresholds or interrupts to determine when the ALSA period has expired. Instead it registers a timer that fires periodically to check whether a period has elapsed. This hurts latency and adds load from unnecessary timer interrupts. Is there documentation on how to configure the peripheral to generate interrupts on buffer state conditions?

The current datasheet is a useful start, but it does not contain enough detail to act on.
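
To make the layout difference in point 1 concrete, here is a minimal Python sketch of converting between the two orderings. It assumes blocks of 8 left samples followed by 8 right samples; that chunk size is read off the LLLLLLLLRRRRRRRR pattern above and is an assumption, not a datasheet figure. The function names are hypothetical and are not part of any driver.

```python
def interleaved_to_chunked(samples, chunk=8):
    """Convert LRLR... stereo samples into [L * chunk][R * chunk] blocks.

    chunk=8 is an assumption taken from the LLLLLLLLRRRRRRRR pattern.
    """
    if len(samples) % (2 * chunk) != 0:
        raise ValueError("length must be a multiple of 2 * chunk")
    out = []
    for base in range(0, len(samples), 2 * chunk):
        block = samples[base:base + 2 * chunk]
        out.extend(block[0::2])  # left samples of this block
        out.extend(block[1::2])  # right samples of this block
    return out


def chunked_to_interleaved(samples, chunk=8):
    """Inverse of interleaved_to_chunked."""
    if len(samples) % (2 * chunk) != 0:
        raise ValueError("length must be a multiple of 2 * chunk")
    out = []
    for base in range(0, len(samples), 2 * chunk):
        left = samples[base:base + chunk]
        right = samples[base + chunk:base + 2 * chunk]
        for l, r in zip(left, right):
            out.extend((l, r))
    return out


# Even numbers stand for left samples, odd numbers for right samples.
interleaved = list(range(32))
chunked = interleaved_to_chunked(interleaved)
```

The round trip is lossless, so the cost of supporting the non-standard layout is one extra copy per buffer rather than any loss of information.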

Post by odroid »

We also tried to get jackd working with I2S DMA/mmap, but we could not because of the very limited information available.
Amlogic officially supports only the Android platform, and there has been almost no support for generic Linux.
Once a SoC peripheral works on Android, Amlogic makes no further effort to improve the driver.

Anyway, I will try to ask about the interleaved data stream and the buffer-threshold-based IRQ.
But please don't expect anything quick or useful. :(

Post by crashoverride »

Since I recently had to deal with ALSA in a real time environment (video playback), I will offer some comments.

I do not believe there would be any benefit from mmap support. My understanding of the driver code is that I2S has a very small buffer. Without DMA and a large hardware buffer, it's effectively the same as a regular copy/write operation. Since there is no IOMMU, only CMA memory would be suitable as a buffer anyway. Hardware has changed dramatically since the ALSA Linux API was designed decades ago. The system is more than fast enough to deal with sound buffer copies, considering it's fast enough to deal with 4K video buffers. My application takes multi-channel 48 kHz audio and performs a brute-force copy/swizzle in memory before sending it on to ALSA without mmap. The performance impact is negligible. Even if it were not, ARM NEON can swizzle "for free" to convert between interleaved and non-interleaved channels.

The optimal solution I found for driving ALSA was to create a dedicated thread that calls snd_pcm_writei in a continuous loop to drive playback, using snd_pcm_delay to find the current playback position.

Post by brad »

My reasoning is that I want to run a Jack audio server. By design, for near-real-time audio, Jack requires mmap/DMA-capable ALSA drivers so it can copy buffers into shared memory that is accessed and manipulated by multiple processes on the machine. There is an mmap emulation plugin for ALSA named mmap_emul, which I believe wraps a non-mmap PCM to emulate the mmap functions so that Jack can copy to and from shared memory. I need to look at it further.

Regarding the small buffer size, do you think it will limit throughput, or just reduce any benefit gained from mmap? Ideally, with a low-latency or real-time kernel, the buffer sizes and sample rate can be tuned to suit the small hardware buffer, since we really want the input stream out of the buffer and into shared memory as soon as possible, and the same for any output stream. The sample rate, buffer sizes and buffer latencies are all configurable, so I'm hoping to get a working setup sometime soon.

Post by crashoverride »

The I2S and other audio hardware is documented in the S905 datasheet. I2S output starts on page 200. The block diagram shows a 64x64-bit FIFO. The registers indicate a maximum 32-bit DDR (physical) address. There is no IOMMU to provide page faults or mapping.

Using any audio server such as Jack or PulseAudio only adds latency. There is no faster supported method to drive audio than talking directly to the ALSA driver. The discussion does puzzle me a bit because this is not the first thread where a question of latency was posed. ALSA is from a time when processors were measured in megahertz. Today, processors and DDR3 memory are measured in gigahertz, yet audio remains measured in kilohertz. Audio latency is a function of how many samples are buffered rather than how fast the hardware is.
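
As a rough illustration of "latency is a function of how many samples are buffered", the sketch below converts a buffered frame count into time and applies it to the 64x64-bit FIFO mentioned above. It assumes the FIFO holds packed 16-bit stereo frames (4 bytes per frame); the datasheet excerpt quoted here does not confirm how samples are packed, so treat the numbers as an estimate only. The function names are hypothetical.

```python
def buffered_latency_ms(frames, rate_hz):
    """Time represented by `frames` buffered frames at `rate_hz`."""
    return 1000.0 * frames / rate_hz


def fifo_frames(entries, entry_bits, bytes_per_frame):
    """How many frames fit in a FIFO of `entries` x `entry_bits` bits."""
    return (entries * entry_bits // 8) // bytes_per_frame


# Assumption: 16-bit stereo, i.e. 4 bytes per frame, packed with no padding.
frames_in_fifo = fifo_frames(entries=64, entry_bits=64, bytes_per_frame=4)
fifo_ms = buffered_latency_ms(frames_in_fifo, rate_hz=48000)
```

Under these assumptions the FIFO holds 128 frames, a little under 3 ms of audio at 48 kHz, regardless of CPU or memory clock speed.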

Post by brad »

crashoverride wrote: Using any audio server such as Jack or PulseAudio only adds latency. There is no faster supported method to drive audio than talking directly to the ALSA driver. The discussion does puzzle me a bit because this is not the first thread where a question of latency was posed.

I appreciate the information, I really do. Here is some background on my situation and why I need a Jack server, its software DSP and various plugins. Jack requires a number of sample blocks to exist in shared memory for read/write at any one point in time. The optimal number depends on the sample rate and the number of frames per block, and it determines the audio latency through the DSP. This is in addition to any copying to and from the device buffers and any latency in the device itself.

Normally latency does not matter too much, even for a DSP. But if we need to take an input channel, make some serious modifications to it in the DSP and send it back out on an output channel for real-time broadcast, latency becomes a killer. I've done some testing via USB and I can achieve sampling at 48000 Hz with 2 periods of 64 frames per block. That is about 1.33 ms per block, times 2, so about 2.7 ms through the DSP. Latency in the drivers can also be an issue, as can the time taken to place the data in shared memory. The human ear also has a latency, and to fool even sensitive ears we need under 10 ms total, ideally 7 ms.
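
The arithmetic behind these figures can be written out as a short Python sketch. The 48000 Hz rate, 64 frames per block, 2 periods and the 7 ms and 10 ms targets come from the post; the function and variable names are hypothetical.

```python
def block_latency_ms(frames_per_block, rate_hz):
    """Duration of one block of audio in milliseconds."""
    return 1000.0 * frames_per_block / rate_hz


def dsp_latency_ms(periods, frames_per_block, rate_hz):
    """Latency through the DSP when `periods` blocks are in flight."""
    return periods * block_latency_ms(frames_per_block, rate_hz)


per_block = block_latency_ms(64, 48000)      # about 1.33 ms
through_dsp = dsp_latency_ms(2, 64, 48000)   # about 2.67 ms

# Budget left for drivers and shared-memory copies under each target.
left_for_ideal = 7.0 - through_dsp
left_for_limit = 10.0 - through_dsp
```

That leaves roughly 4.3 ms under the 7 ms target and 7.3 ms under the 10 ms limit for everything outside the DSP.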

The C2 has more than enough CPU power to drive the DSP. I just need a way to get the audio channels in and out of shared memory. Even with DMA driver emulation I suspect I can get close to the latency I need, as there are quite a few milliseconds of latency to play with.

Post by miceopede »

crashoverride wrote: The optimal solution I found for driving ALSA was to create a thread to do it using snd_pcm_delay to figure out where the playback is at and snd_pcm_writei to drive the playback in a continuous loop.

Interesting. So your application loops continuously, calling snd_pcm_delay until the delay is less than the number of frames per period, and then calls writei() with one period's worth?

Were you able to measure real-world time latency for playback?

Post by miceopede »

crashoverride wrote: The I2S and other audio hardware is documented in the S905 datasheet. I2S output starts on page 200. The block diagram shows there is a 64x64(bit) FIFO. The registers indicate a max 32bit DDR (physical) address. There is no IOMMU to provide page faults or mapping.

I have read the relevant datasheet. As far as I can tell, the FIFO is fed from DDR, with the address programmed into various pointer registers. These addresses are DMA-able, and trivially mmap-able by userspace. I don't see why an IOMMU is relevant: there are no faults involved, and virtual-to-physical mappings are not needed for the I2S peripheral. What I'm asking is whether more technical information could be requested for the datasheet, so that the IRQ alluded to in it could be used.

crashoverride wrote: Using any audio server such as Jack or PulseAudio only adds latency. There is no faster supported method to drive audio than talking directly to the ALSA driver. The discussion does puzzle me a bit because this is not the first thread where a question of latency was posed. ALSA is from a time when processors were measured in megahertz. Today, processors and DDR3 memory are measured in gigahertz, yet audio remains measured in kilohertz. Audio latency is a function of how many samples are buffered rather than how fast the hardware is.

I understand the difference. I have enabled the MMAP interface and applied appropriate patches to Jack so it can use the odd S905 layout. In my application this yields about 3 ms from external trigger to sound output after the DAC, which is fine for my requirements. Jack provides a few nice things that I use, but I'm fine with using a direct ALSA interface if I have to. Although this works for me, these patches could never go upstream because of the nonstandard layout. A non-trivial amount of software depends on the mmap interface, Jack included.

My main peeve is that the driver triggers snd_pcm_period_elapsed() from a periodic timer that simply polls to see whether the period has elapsed and the userspace thread can be woken. I use very short buffer sizes, so I've increased the timer frequency. This works for me, but it is obviously a hack. I could probably make more elaborate hacks that try to tune the timer period, but using the hardware IRQ is really the best thing for everyone, Amlogic included.
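
To show why timer polling hurts with short periods, here is a minimal Python sketch that models the extra delay between a period actually ending and the next timer tick that notices it. The timer intervals used are hypothetical values chosen for illustration, not taken from the Amlogic driver, and the model assumes the timer and the stream both start at t = 0.

```python
def detection_delays_us(period_us, timer_us, n_periods):
    """Delay from each period boundary to the next polling tick, in microseconds."""
    delays = []
    for k in range(1, n_periods + 1):
        boundary = k * period_us
        # Round the boundary up to the next multiple of the timer interval.
        next_tick = -(-boundary // timer_us) * timer_us
        delays.append(next_tick - boundary)
    return delays


# 64 frames at 48 kHz is roughly 1333 us per period.
slow_timer = detection_delays_us(period_us=1333, timer_us=1000, n_periods=4)
fast_timer = detection_delays_us(period_us=1333, timer_us=250, n_periods=4)
```

In this model the added delay is always below one timer interval, so shortening the interval shrinks the error but raises the interrupt rate. A period-boundary IRQ from the hardware would avoid that trade-off.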

I'm fine with writing my own driver. I'd even put the work into mainlining it. I'm just looking for more technical information on the part, so that a sane driver could be written. The Amlogic code will never, ever, be mainlineable.

While I commend the freely available nature of the datasheet (unlike some other ARM vendors...), it could certainly use improvement in both translation quality and completeness. I suspect that if Amlogic did this, they would find an explosion of new drivers, bug fixes, and new features *for free,* especially as the Meson patches are really starting to hit mainline now.

Post by crashoverride »

miceopede wrote: So your application loops continuously calling snd_pcm_delay, until the delay is <frames/per period? Then do the writei() with one period worth?

The snd_pcm_delay call is only used for feedback on the current playback position. The actual work is simply calling snd_pcm_writei repeatedly. The call blocks until it can complete. There is "zero" latency using this method, meaning the feedback time matches the independent video time without any latency delta. The main point of the discussion was that there will likely be no change in latency using copy/write versus mmap, because the system is capable of moving and processing vast amounts of data within the sample time (kHz).
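
The shape of that loop can be sketched in Python with a toy device model. snd_pcm_writei and snd_pcm_delay are the real ALSA calls being described; FakePcm below is a hypothetical stand-in that does not call ALSA. It assumes the producer is always faster than playback, so simulated time only advances while a write is blocked.

```python
class FakePcm:
    """Toy model of a blocking PCM device; not the ALSA API."""

    def __init__(self, buffer_frames):
        self.buffer_frames = buffer_frames
        self.written = 0  # total frames accepted from the application
        self.played = 0   # total frames the simulated hardware has played

    def writei(self, n):
        """Accept n frames, 'blocking' until the buffer has room."""
        if n > self.buffer_frames:
            raise ValueError("write larger than buffer")
        overflow = self.written + n - self.played - self.buffer_frames
        if overflow > 0:
            self.played += overflow  # hardware drains while we wait
        self.written += n
        return n

    def delay(self):
        """Frames queued but not yet played."""
        return self.written - self.played


pcm = FakePcm(buffer_frames=256)
positions = []
for _ in range(6):
    pcm.writei(64)                                # drive playback
    positions.append(pcm.written - pcm.delay())   # feedback only
```

The playback position is the number of frames written minus the reported delay, which is the value a video player would compare against its own clock.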

With regard to DSPs, they require a minimum set of samples to operate on. The amount of latency introduced depends on the source. If the source is generated or pre-recorded, it arrives at the DSP faster than real time and may introduce no extra latency. However, if the source is real time, there will be additional latency that depends on how many samples the DSP needs in advance to function.

This is not to say that better information (a better datasheet) could not produce a better driver, or that a better driver would not improve compatibility with applications like Jack. Rather, it is symptomatic of the driver API that a valid implementation produces such inconsistencies. The IOMMU discussion was meant to illustrate that there is no way a user-allocated buffer (malloc) could be used directly by the sound hardware.

Post by alexruedi »

So it is possible to use I2S with a Jack server? Could you give me some details?

I tried with alsa_out and the dummy driver, but it causes clock sync issues after about 40 seconds.
