Re: MSI

From
Alexander Kolesnikoff (2:5020/400)
To
Valentin Nechayev (2:5054/37.63)
Date
2006-11-18T17:42:26Z
Area
RU.UNIX.BSD
From: Alexander Kolesnikoff <ak@hvv.uku.com.ru>

Valentin Nechayev <netch@segfault.kiev.ua> wrote:
> 
>>>> Slawa Olhovchenkov wrote: 
> 
> EG>> What is PCI Message Signalled Interrupts, support for which
> EG>> appeared in CURRENT for the gigabit cards?
> 
> SO> My guess is that it's something smarter than bus mastering. It
> SO> offloads the CPU and improves efficiency.
> 
> You're guessing wrong. :) Judging by the description, it is a way of
> transporting an interrupt request not through the standard routing
> (where there are, for example, 4 lines per PCI slot), but as a message
> written into the APIC's inbound memory, which the APIC then turns into
> an interrupt on the processor.
> 
> The technology is, of course, as perverse as almost everything from
> Intel, but it can yield an optimization, in particular freedom from the
> shared-interrupts problem. And if the references don't lie, a few extra
> bytes of message data can be passed along with the request, so nothing
> extra has to be poked at all. :)

  But several advantages should encourage use of MSI:

    * Avoid EOI write: One indirect function call is needed to generate an
      End Of Interrupt (EOI) write to the IO SAPIC because PCI uses
      level-triggered interrupts. This cost is hidden from device driver
      writers but is required to indicate when the OS thinks a
      level-triggered interrupt has been serviced. If the IRQ line is still
      asserted when the EOI write reaches the IO SAPIC, another interrupt
      transaction will be generated. Though MMIO writes are posted, IO bus
      bandwidth and some number of CPU cycles are consumed.

    * Exclusive Vector: The device driver can avoid an indirect function
      call if both a shared PCI IRQ line and a shared CPU vector are
      avoided. IO SAPIC implementations to date typically have only 7 IRQ
      lines - enough for several single-function PCI devices. Several
      multi-function PCI devices (e.g. a 4-port 100BT card) will result in
      shared IRQ lines. A shared CPU vector should only occur in very large
      systems under rare circumstances.

    * DMA ordering: Normally, the IRQ line bypasses the DMA data path. Thus
      race conditions exist where a DMA might not reach the cache coherency
      "domain" before the IRQ is delivered and acted upon. For PCs and the
      like this typically isn't a problem since the IO paths are short.
      Similarly, the HP ZX1 chip set places an IO SAPIC on each PCI Host bus
      adapter. This results in the interrupt transaction getting delivered
      after any previous DMA from the PCI bus.

      However, the IA64 architecture allows the IO SAPIC to be placed
      anywhere in the system topology. For larger systems, this can be a
      problem. When the interrupt is a transaction on the bus, PCI ordering
      rules prevent the MSI from bypassing any inbound DMA transaction.
      Thus, when the interrupt finally reaches the CPU, one can be certain
      all DMA has reached the cache coherency domain (e.g. memory) as well
      and is not stuck in any coalescing buffers between the IO device and
      the destination memory. Thus one doesn't need any additional magic to
      guarantee the in-flight DMA is coherent with CPU caches.

    * Target multiple CPUs: This is a wish list item. Given the right
      services, a smart device can target transaction completions at
      different CPUs by generating interrupt transactions for specific
      Local SAPICs. One goal might be to service the interrupt on the same
      CPU that initiated the transaction. Tradeoffs between driver D-cache
      footprint and interrupt latency would help determine applications
      for this. High-performance clustering folks were looking at this,
      but I have not heard of any prototype efforts.

  So what exactly is perverse here, and which Intel perversions are we talking about?

 Alexander
--- ifmail v.2.15dev5.3
 * Origin: UKU (2:5020/400)