
DMA operation to 64-bits PC platform


Charles Gardiner
Guest

Tue Jul 06, 2010 10:31 am   



Frank van Eijkelenburg wrote:

Quote:
We tried your suggestion (we were using BUFFERED_IO). Unfortunately it
was not the (final) solution.

Was there any noticeable change in the behaviour at all?

Is it still valid that your FPGA can _read_ data from the buffer when your
application writes it there?

With DO_DIRECT_IO specified, it's not clear to me off-hand why you are not seeing
the memory locations in both directions now.

Quote:
Perhaps there are more causes for the
problem. Anyway, thanks for your suggestion. We are almost out of
ideas of what we can test. Do you have other ideas or tests we can do
to find the cause? I hope to fix the problem before my vacation (only
one day left :)

Oops, that's tight. I'm just on the way to a customer's so I don't have my usual
references at hand. Have you tried the flush (zero-length read from the FPGA) after a
write to memory? To be honest, though, I don't think that's the solution (just a
straw to grasp at in case your system has some caching behaviour I haven't seen
before). My last (KMDF-based) design was similar to yours. The FPGA was streaming
to memory and the SW application was reading from the buffer shared between
application memory and kernel memory. I never had any data loss, even without the
zero-length read.

If you can send me as much relevant info as possible, I'll have another look this
evening.

Regards,
Charles

Frank van Eijkelenburg
Guest

Tue Jul 06, 2010 12:00 pm   



On Jul 2, 9:04 pm, Charles Gardiner <charles.gardi...@invalid.invalid>
wrote:
Quote:
Hi Frank,



I am not sure if we understand each other.

Yes, it certainly sounds like that.

What do you mean by
completing the request with IoCompleteRequest? There is no request
from software point of view.

I think this might clear up the reason why your data is missing. (See also below
about the type of DMA). I don't think the S/G list you are getting is describing
your application buffer. This is best done by specifying DO_DIRECT_IO as the DMA
method for your device. If you specify DO_BUFFERED_IO you will get an S/G List
describing an intermediate buffer in kernel space and this probably never gets
copied over to your application space buffer unless you terminate the request.
I've never done the 'neither' method myself and from what I hear, it's a
complicated beast.

The FPGA will do a DMA write (data from
FPGA to PC memory) at its own initiative. The allocated memory is used
as long as the software is running. I do not allocate new memory for
each new DMA transfer, but at startup a large piece of memory is
allocated and the physical addresses are written to the FPGA by the
driver software.

Sounds like you are doing something like a circular buffer in memory which stays
alive as long as your device does?



And yes, we use a DMA adapter in combination with the
GetScatterGatherList method. We already used this in another project
but that was PCI and DMA read (data from PC memory to FPGA).

By the way, where can I set the type of DMA?

Typically, you set the DMA buffering method in your AddDevice function after you
create your device object. Quoting from Oney's book,

NTSTATUS AddDevice(..) {
   PDEVICE_OBJECT    fdo;

   IoCreateDevice(....., &fdo);
   fdo->Flags |= DO_BUFFERED_IO;
            /* or */
   fdo->Flags |= DO_DIRECT_IO;
            /* or */
   fdo->Flags |= 0;  // i.e. neither Direct nor Buffered
   ...
}

And, you can't change your mind afterwards.

By the way, if my assumption about the circular buffer in your design is correct,
there is a slightly more standard solution (standard in the sense that everybody
on the Microsoft drivers newsgroup seems to do it). It does, however, require two threads
in your application. The first one requests a buffer (using new or malloc) and
sets up an I/O request (ReadFile, WriteFile or DeviceIoControl) referencing this
buffer. This is performed as an asynchronous request.

The driver recognises this request and pends it indefinitely (you typically complete
it when your driver is shutting down, otherwise Windows will probably hang).
Pending the request has the nice side effect that the buffer now becomes locked
down permanently.

Assuming you have set up your driver to use DO_DIRECT_IO DMA, you can get the S/G
list describing the application space buffer as you are currently doing and feed
this to your FPGA.

Using the second thread in your application you can constantly read data from the
locked-down pages (your app. space buffer) that are being written by your FPGA.
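
The reader side of such a buffer can be sketched in plain C as follows. This is only an illustration under stated assumptions: ring_read and its parameters are hypothetical names, and it assumes the application can learn the FPGA's current write index somehow (for example through a driver call or a status word), which the thread does not describe.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy up to `max` unread bytes out of a circular buffer of `ring_size` bytes.
 *   ring : start of the locked-down buffer the device writes into
 *   rd   : consumer read index, updated on return (0 <= *rd < ring_size)
 *   wr   : producer (FPGA) write index (0 <= wr < ring_size), assumed known
 * Returns the number of bytes copied. rd == wr is treated as "empty".
 */
static size_t ring_read(const uint8_t *ring, size_t ring_size,
                        size_t *rd, size_t wr, uint8_t *dst, size_t max)
{
    size_t avail = (wr >= *rd) ? (wr - *rd) : (ring_size - *rd + wr);
    size_t n = (avail < max) ? avail : max;

    /* First part: from the read index up to the end of the ring. */
    size_t first = ring_size - *rd;
    if (first > n)
        first = n;
    memcpy(dst, ring + *rd, first);

    /* Second part: whatever wrapped around to the start of the ring. */
    memcpy(dst + first, ring, n - first);

    *rd = (*rd + n) % ring_size;
    return n;
}
```

In a real application the buffer is being written by hardware while this runs, so the write index would have to be fetched freshly before every call and the buffer treated as volatile data.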

Assuming the DO_DIRECT_IO solves your problem (I think there is a good chance), I
would however still consider migrating to a KMDF-based driver, particularly if
you are writing a new one. It's much easier to maintain and is probably more
portable to future MS versions.



best regards,

Frank

best regards,
Charles

Hi Charles,

We tried your suggestion (we were using BUFFERED_IO). Unfortunately it
was not the (final) solution. Perhaps there are more causes for the
problem. Anyway, thanks for your suggestion. We are almost out of
ideas of what we can test. Do you have other ideas or tests we can do
to find the cause? I hope to fix the problem before my vacation (only
one day left :)

best regards,

Frank

Michael S
Guest

Tue Jul 06, 2010 12:44 pm   



On Jul 6, 11:00 am, Frank van Eijkelenburg
<fei.technolut...@gmail.com> wrote:
Quote:
On Jul 2, 9:04 pm, Charles Gardiner <charles.gardi...@invalid.invalid>
wrote:



Hi Frank,

I am not sure if we understand each other.

Yes, it certainly sounds like that.

What do you mean by
completing the request with IoCompleteRequest? There is no request
from software point of view.

I think this might clear up the reason why your data is missing. (See also below
about the type of DMA). I don't think the S/G list you are getting is describing
your application buffer. This is best done by specifying DO_DIRECT_IO as the DMA
method for your device. If you specify DO_BUFFERED_IO you will get an S/G List
describing an intermediate buffer in kernel space and this probably never gets
copied over to your application space buffer unless you terminate the request.
I've never done the 'neither' method myself and from what I hear, it's a
complicated beast.

The FPGA will do a DMA write (data from
FPGA to PC memory) at its own initiative. The allocated memory is used
as long as the software is running. I do not allocate new memory for
each new DMA transfer, but at startup a large piece of memory is
allocated and the physical addresses are written to the FPGA by the
driver software.

Sounds like you are doing something like a circular buffer in memory which stays
alive as long as your device does?

And yes, we use a DMA adapter in combination with the
GetScatterGatherList method. We already used this in another project
but that was PCI and DMA read (data from PC memory to FPGA).

By the way, where can I set the type of DMA?

Typically, you set the DMA buffering method in your AddDevice function after you
create your device object. Quoting from Oney's book,

NTSTATUS AddDevice(..) {
   PDEVICE_OBJECT    fdo;

   IoCreateDevice(....., &fdo);
   fdo->Flags |= DO_BUFFERED_IO;
            /* or */
   fdo->Flags |= DO_DIRECT_IO;
            /* or */
   fdo->Flags |= 0;  // i.e. neither Direct nor Buffered
   ...
}

And, you can't change your mind afterwards.

By the way, if my assumption about the circular buffer in your design is correct,
there is a slightly more standard solution (standard in the sense that everybody
on the Microsoft drivers newsgroup seems to do it). It does, however, require two threads
in your application. The first one requests a buffer (using new or malloc) and
sets up an I/O request (ReadFile, WriteFile or DeviceIoControl) referencing this
buffer. This is performed as an asynchronous request.

The driver recognises this request and pends it indefinitely (you typically complete
it when your driver is shutting down, otherwise Windows will probably hang).
Pending the request has the nice side effect that the buffer now becomes locked
down permanently.

Assuming you have set up your driver to use DO_DIRECT_IO DMA, you can get the S/G
list describing the application space buffer as you are currently doing and feed
this to your FPGA.

Using the second thread in your application you can constantly read data from the
locked-down pages (your app. space buffer) that are being written by your FPGA.

Assuming the DO_DIRECT_IO solves your problem (I think there is a good chance), I
would however still consider migrating to a KMDF-based driver, particularly if
you are writing a new one. It's much easier to maintain and is probably more
portable to future MS versions.

best regards,

Frank

best regards,
Charles

Hi Charles,

We tried your suggestion (we were using BUFFERED_IO).

If you were using BUFFERED_IO why was your driver locking the pages?
In case of BUFFERED_IO the pages come from kernel non-paged pool and
don't have to be specifically locked. The only case where the driver
is responsible for locking/unlocking pages is NEITHER I/O.

Quote:
Unfortunately it
was not the (final) solution. Perhaps there are more causes for the
problem. Anyway, thanks for your suggestion. We are almost out of
ideas of what we can test. Do you have other ideas or tests we can do
to find the cause? I hope to fix the problem before my vacation (only
one day left :)

best regards,

Frank

Another typical mistake is that the driver forgets to call IoMarkIrpPending().
KMDF does it automatically, but in plain WDM it's the responsibility of
your driver. However, a forgotten IoMarkIrpPending() normally shows
different symptoms.

Michael S
Guest

Tue Jul 06, 2010 1:12 pm   



On Jul 6, 11:00 am, Frank van Eijkelenburg
<fei.technolut...@gmail.com> wrote:

Quote:
I hope to fix the problem before my vacation (only one day left :)


Something I most certainly DO NOT RECOMMEND as a final solution, but
it could help you go on vacation in a better mood.
Scrap all the schoolbook nice & complex Windows DMA API stuff. Instead,
take your Irp->MdlAddress, do MmGetMdlPfnArray() and access physical
addresses directly. It's wrong, it's immoral, but on a simple x86/x64 PC
or on a small dual-processor server it always works.
Just don't forget to bring back the official DMA API when you are back
from vacation and have more time than a few hours.
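
The address arithmetic behind this shortcut can be sketched in user-space C. This is a model, not driver code: the pfns array stands in for what MmGetMdlPfnArray() returns, byte_offset for MmGetMdlByteOffset() and byte_count for MmGetMdlByteCount(). The names phys_seg and pfns_to_segments are hypothetical, and 4 KB pages are assumed.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12u                      /* assumes 4 KB pages */
#define MODEL_PAGE_SIZE  (1u << MODEL_PAGE_SHIFT)

typedef struct {
    uint64_t phys;  /* physical start address of the segment */
    uint32_t len;   /* segment length in bytes */
} phys_seg;

/*
 * Turn a page-frame-number list into physical segments, merging pages that
 * happen to be physically contiguous. Only the first page starts at
 * byte_offset; every later page starts at offset 0.
 * Returns the number of segments written. If `out` fills up before the
 * buffer is fully described, the list is silently truncated.
 */
static size_t pfns_to_segments(const uint64_t *pfns, size_t npages,
                               uint32_t byte_offset, uint32_t byte_count,
                               phys_seg *out, size_t max_out)
{
    size_t n = 0;
    uint32_t off = byte_offset;

    for (size_t i = 0; i < npages && byte_count > 0; i++) {
        uint32_t len = MODEL_PAGE_SIZE - off;
        if (len > byte_count)
            len = byte_count;

        uint64_t phys = (pfns[i] << MODEL_PAGE_SHIFT) + off;

        if (n > 0 && out[n - 1].phys + out[n - 1].len == phys) {
            out[n - 1].len += len;          /* extend the previous segment */
        } else {
            if (n == max_out)
                return n;                   /* no room left: truncate */
            out[n].phys = phys;
            out[n].len = len;
            n++;
        }
        byte_count -= len;
        off = 0;
    }
    return n;
}
```

The resulting segments are what would be written to the FPGA. Unlike the official DMA API, this takes no account of bounce buffers or address translation between CPU and bus, which is why it is only a stopgap.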

Frank van Eijkelenburg
Guest

Wed Aug 11, 2010 6:50 pm   



On Jul 6, 12:12 pm, Michael S <already5cho...@yahoo.com> wrote:
Quote:
On Jul 6, 11:00 am, Frank van Eijkelenburg
<fei.technolut...@gmail.com> wrote:
I hope to fix the problem before my vacation (only one day left :)

Something I most certainly DO NOT RECOMMEND as a final solution, but
it could help you go on vacation in a better mood.
Scrap all the schoolbook nice & complex Windows DMA API stuff. Instead,
take your Irp->MdlAddress, do MmGetMdlPfnArray() and access physical
addresses directly. It's wrong, it's immoral, but on a simple x86/x64 PC
or on a small dual-processor server it always works.
Just don't forget to bring back the official DMA API when you are back
from vacation and have more time than a few hours.

Finally, I solved the problem. For those who want to learn from
mistakes of others, here comes the cause of the problem:

The packets which were transmitted to the PC were too large (more than
the maximum payload size of the receiver). In that case, the packets
are simply dropped (no errors). Of course I have to read the maximum
payload size from the Device Control register in the PCI Express
Capability structure and keep my packets within it.
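
Decoding that field can be sketched as follows. The register layout is the standard one (Device Control at offset 08h of the PCI Express Capability structure, Max_Payload_Size in bits 7:5, encoded as 128 bytes shifted left by the field value). The function and macro names are made up for the example, and how the 16-bit register value reaches this code (config read on the PC side, or a status port on the FPGA core) is left open.

```c
#include <stdint.h>

/* Device Control register, relative to the PCI Express Capability structure. */
#define PCIE_DEVCTL_OFFSET     0x08u
#define PCIE_DEVCTL_MPS_SHIFT  5u     /* Max_Payload_Size is bits 7:5 */
#define PCIE_DEVCTL_MPS_MASK   0x7u

/*
 * Return the programmed maximum payload size in bytes,
 * or 0 if the field holds a reserved encoding (110b or 111b).
 */
static uint32_t mps_bytes_from_devctl(uint16_t devctl)
{
    uint32_t code = (devctl >> PCIE_DEVCTL_MPS_SHIFT) & PCIE_DEVCTL_MPS_MASK;

    if (code > 5u)
        return 0u;

    return 128u << code;   /* 000b = 128 bytes ... 101b = 4096 bytes */
}
```

A DMA engine would then cap the payload of every memory write TLP at this value. Many PC chipsets program 128 bytes here, so a design that always sends larger packets fails in exactly the silent way described above.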

best regards,

Frank

Michael S
Guest

Thu Aug 12, 2010 1:40 am   



On Aug 11, 5:50 pm, Frank van Eijkelenburg
<fei.technolut...@gmail.com> wrote:
Quote:

Finally, I solved the problem. For those who want to learn from
mistakes of others, here comes the cause of the problem:

The packets which were transmitted to the pc were too large (more than
the maximum payload size of the receiver). In that case, the packets
are simply dropped (no errors). Of course I have to read the maximum
payload size from the device control register in the PCI Express
Capability structure.

best regards,

Frank

Thanks for the interesting update, Frank. I never even thought in that
direction.
Last time we did PCIe on an FPGA we used an Altera core with the Avalon-MM
wrapper. This configuration doesn't support outstanding packets that
are longer than 256 bytes so, obviously, we were immune to the maximum
payload size trap.
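
For designs that can emit longer packets, the splitting rule can be sketched like this. It is a software model of what the DMA engine has to do, with hypothetical function names: each memory write is limited by the bytes remaining, by the maximum payload size, and by the PCIe rule that a memory request must not cross a 4 KB address boundary.

```c
#include <stdint.h>

/*
 * Payload length of the next memory write TLP starting at `addr`.
 * mps_bytes is the maximum payload size in bytes (128, 256, ...).
 */
static uint32_t next_tlp_len(uint64_t addr, uint32_t remaining,
                             uint32_t mps_bytes)
{
    uint32_t to_boundary = 4096u - (uint32_t)(addr & 0xFFFu);
    uint32_t len = remaining;

    if (len > mps_bytes)
        len = mps_bytes;
    if (len > to_boundary)
        len = to_boundary;
    return len;
}

/* Number of TLPs needed to move `total` bytes starting at `addr`. */
static uint32_t count_tlps(uint64_t addr, uint32_t total, uint32_t mps_bytes)
{
    uint32_t n = 0;

    if (mps_bytes == 0)
        return 0;               /* invalid setting: nothing can be sent */

    while (total > 0) {
        uint32_t len = next_tlp_len(addr, total, mps_bytes);
        addr += len;
        total -= len;
        n++;
    }
    return n;
}
```

For example, with a 128-byte maximum payload a 1024-byte transfer starting on a page boundary takes eight TLPs.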

FPGA
Guest

Thu Aug 12, 2010 9:42 pm   



On Jul 1, 11:03 am, Frank van Eijkelenburg
<fei.technolut...@gmail.com> wrote:
Quote:
Hi,

I have a custom-made PCIe board with a Virtex-5 FPGA on which I
implemented a DMA unit which uses the PCIe Endpoint Block Plus v1.14.
I also implemented simple read/write operations from the PC to the
board (the board responds with completion TLPs). The read/write
operations are working, but DMA is not.

The board is inserted in a PC running 64-bit Windows 7. An
application allocates virtual memory and passes the memory block to
the driver. The driver locks the memory and converts the virtual
addresses into physical addresses. These physical addresses are
written to the FPGA.

When I start a DMA operation, I can see in ChipScope the correct
physical addresses in the TLP header. However, I do not see the
correct values in the allocated memory. What can I do to check where
it is going wrong?

Another question is about the memory request TLPs. What should I use,
32 or 64 bit write requests? Or do I have to check runtime if the
physical memory address is below or above the 4 GB (and use
respectively 32 and 64 bit requests)?

Thanks in advance,

Frank
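
On the 32 versus 64 bit question, the PCI Express rule is that a requester must use the 32-bit address format (3DW header) for addresses below 4 GB and the 64-bit format (4DW header) otherwise, so the check does have to be made at run time for each request. A minimal sketch, with made-up function and enum names:

```c
#include <stdbool.h>
#include <stdint.h>

/* Fmt[1:0] field values for a request that carries data (Memory Write). */
enum {
    FMT_3DW_WITH_DATA = 0x2,   /* 32-bit address, 3DW header */
    FMT_4DW_WITH_DATA = 0x3    /* 64-bit address, 4DW header */
};

/* True when the target address does not fit in 32 bits. */
static bool needs_64bit_header(uint64_t phys_addr)
{
    return (phys_addr >> 32) != 0;
}

/* Fmt field to put in the header of a Memory Write to phys_addr. */
static unsigned mwr_fmt(uint64_t phys_addr)
{
    return needs_64bit_header(phys_addr) ? FMT_4DW_WITH_DATA
                                         : FMT_3DW_WITH_DATA;
}
```

Checking only the start address is enough, because a single request may not cross a 4 KB boundary and therefore cannot straddle the 4 GB line.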

Could somebody please help me identify what the "4177" suffix
denotes on this specific Xilinx Virtex-5 device, and whether the part is
compatible with the same device without this suffix?
XC5VTX240T-2FF1759I4177 vs. XC5VTX240T-2FF1759I. Avnet lists them both
on their website, but with the "4177" suffix the price is roughly
$1,000.00 more. It doesn't seem like it would be a specific customer
code, as they advertise both to the public. Any help with this is
appreciated.

Tks,

Dave
