Profiling Results

| Benchmark | Cosimulation Cycles | LightningSim Cycles | OmniSim Cycles | Cosimulation Runtime (in s) | LightningSim Runtime (in s) | OmniSim Runtime (in s) | LightningSim Speedup (vs. Cosim) | OmniSim Speedup (vs. Cosim) | OmniSim Speedup (vs. LightningSim) |
|---|---|---|---|---|---|---|---|---|---|
| Fixed-point square root | 30 | 30 | 30 | 27.25 | 4.97 | 3.65 | 5.48× | 7.47× | 1.36× |
| FIR filter | 172 | 172 | 172 | 20.12 | 2.43 | 1.94 | 8.23× | 10.37× | 1.25× |
| Fixed-point window conv | 35 | 35 | 35 | 28.30 | 3.69 | 3.15 | 7.67× | 8.98× | 1.71× |
| Floating-point conv | 35 | 35 | 35 | 49.78 | 2.42 | 2.46 | 20.57× | 20.24× | 0.98× |
| Arbitrary Precision ALU | 36 | 36 | 36 | 24.17 | 2.12 | 2.03 | 11.40× | 11.91× | 1.04× |
| Parallel loops | 32 | 32 | 32 | 26.81 | 2.34 | 2.16 | 11.48× | 12.41× | 1.08× |
| Imperfect loops | 34 | 34 | 34 | 25.80 | 2.24 | 2.13 | 11.52× | 12.11× | 1.05× |
| Loop with max bound | 31 | 31 | 31 | 24.76 | 2.25 | 2.14 | 11.01× | 11.57× | 1.05× |
| Perfect nested loops | 406 | 406 | 406 | 24.76 | 2.27 | 2.12 | 10.91× | 11.68× | 1.07× |
| Pipelined nested loops | 405 | 405 | 405 | 24.92 | 2.23 | 2.19 | 11.18× | 11.38× | 1.02× |
| Sequential accumulators | 32 | 32 | 32 | 26.59 | 2.29 | 2.20 | 11.61× | 12.09× | 1.04× |
| Accumulators + asserts | 33 | 33 | 33 | 27.13 | 2.30 | 2.30 | 11.80× | 11.80× | 1.00× |
| Accumulators + dataflow | 31 | 31 | 32 | 27.26 | 2.29 | 2.19 | 11.90× | 12.45× | 1.05× |
| Static memory example | 66 | 66 | 66 | 33.23 | 2.18 | 2.12 | 15.24× | 15.67× | 1.03× |
| Pointer casting example | 408 | 408 | 408 | 32.55 | 2.15 | 2.13 | 15.14× | 15.28× | 1.01× |
| Double pointer example | 25 | 25 | 25 | 31.70 | 2.14 | 1.91 | 14.81× | 16.60× | 1.12× |
| AXI-4 master | 178 | 177 | 177 | 21.06 | 2.19 | 2.07 | 9.62× | 16.60× | 10.17× |
| AXIS w/o side channel | 52 | 51 | 51 | 19.12 | 2.06 | 1.94 | 9.28× | 9.86× | 1.06× |
| Multiple array access | 252 | 252 | 252 | 24.32 | 2.18 | 2.08 | 11.16× | 11.69× | 1.05× |
| Resolved array access | 131 | 131 | 131 | 24.36 | 2.20 | 2.05 | 11.07× | 11.88× | 1.07× |
| URAM with ECC | 115 | 115 | 115 | 22.07 | 2.21 | 2.05 | 9.99× | 10.77× | 1.08× |
| Fixed-point Hamming | 259 | 259 | 259 | 33.28 | 2.37 | 2.46 | 14.04× | 13.53× | 0.96× |
| Unoptimized FFT | 261781 | 261150 | 261150 | 153.53 | 2.78 | 2.91 | 55.23× | 52.76× | 0.96× |
| Multi-stage FFT | 3770 | 3772 | 3722 | 61.43 | 2.67 | 2.93 | 23.01× | 20.97× | 0.92× |
| Huffman encoding | 10283 | 10272 | 10272 | 46.89 | 2.63 | 2.32 | 17.83× | 20.21× | 1.13× |
| Matrix Multiplication | 1036 | 1036 | 1036 | 26.33 | 2.61 | 2.59 | 10.09× | 10.17× | 1.01× |
| Parallelized merge sort | 131 | 131 | 131 | 48.79 | 2.27 | 2.15 | 21.49× | 22.69× | 1.06× |
| Vector add with stream | 4261 | 4261 | 4261 | 27.21 | 4.48 | 3.56 | 6.07× | 7.64× | 1.26× |
| FlowGNN GIN | 260359 | 260337 | 260337 | 4219.85 | 28.90 | 11.97 | 146.02× | 352.53× | 2.41× |
| FlowGNN GCN | 112836 | 112561 | 112561 | 534.33 | 30.90 | 17.18 | 17.29× | 31.10× | 1.80× |
| FlowGNN GAT | 17282 | 17282 | 17282 | 838.24 | 41.60 | 24.60 | 20.15× | 34.07× | 1.69× |
| FlowGNN PNA | 344206 | 344206 | 344206 | 3285.45 | 30.50 | 29.00 | 107.72× | 113.29× | 1.05× |
| FlowGNN DGN | 110710 | 110710 | 110710 | 996.13 | 26.90 | 11.71 | 37.03× | 85.07× | 2.30× |
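
The speedup columns appear to be plain runtime ratios: the baseline runtime divided by the runtime of the faster tool. The sketch below recomputes the three ratios for the FlowGNN GIN row under that assumption. The `speedup` helper and the variable names are illustrative and are not part of LightningSim or OmniSim. Because the published runtimes are rounded to two decimals, a recomputed ratio can differ from the table in the last digit.

```python
def speedup(baseline_s: float, tool_s: float) -> float:
    """Return how many times faster the tool is than the baseline."""
    if tool_s <= 0:
        raise ValueError("tool runtime must be positive")
    return baseline_s / tool_s


# Runtimes in seconds, copied from the FlowGNN GIN row of the table.
cosim_s, lightningsim_s, omnisim_s = 4219.85, 28.90, 11.97

lightningsim_vs_cosim = speedup(cosim_s, lightningsim_s)
omnisim_vs_cosim = speedup(cosim_s, omnisim_s)
omnisim_vs_lightningsim = speedup(lightningsim_s, omnisim_s)

print(
    f"LightningSim vs. Cosim: {lightningsim_vs_cosim:.2f}x, "
    f"OmniSim vs. Cosim: {omnisim_vs_cosim:.2f}x, "
    f"OmniSim vs. LightningSim: {omnisim_vs_lightningsim:.2f}x"
)
```

A ratio below 1.00× in the last column (for example, Unoptimized FFT) means OmniSim took slightly longer than LightningSim on that benchmark.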

Important Takeaways!

The above results have been published for version 2021.1 of the Vitis Development Suite.

The timing estimates provided by both LightningSim and OmniSim are 99.9% accurate with respect to the results from C/RTL Co-simulation.
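
The text does not state how accuracy is measured. One plausible reading, assumed here purely for illustration, is one minus the relative cycle-count error against C/RTL Co-simulation, expressed as a percentage. The `cycle_accuracy` helper below is hypothetical and applies that definition to two rows of the table.

```python
def cycle_accuracy(reference_cycles: int, estimated_cycles: int) -> float:
    """Accuracy in percent, ASSUMED to be 100 * (1 - relative cycle error)."""
    if reference_cycles <= 0:
        raise ValueError("reference cycle count must be positive")
    relative_error = abs(estimated_cycles - reference_cycles) / reference_cycles
    return 100.0 * (1.0 - relative_error)


# Cycle counts copied from the table: (Cosimulation, LightningSim/OmniSim).
sqrt_accuracy = cycle_accuracy(30, 30)          # Fixed-point square root
gin_accuracy = cycle_accuracy(260359, 260337)   # FlowGNN GIN

print(f"Fixed-point square root: {sqrt_accuracy:.3f}%")
print(f"FlowGNN GIN: {gin_accuracy:.3f}%")
```

Under this assumed definition, a row where all three cycle counts match scores exactly 100%, and small mismatches such as the 22-cycle difference in FlowGNN GIN reduce the score only slightly.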

LightningSim achieves a speedup of up to 146.02× over C/RTL Co-simulation.

OmniSim achieves a speedup of up to 352.53× over C/RTL Co-simulation.

Both LightningSim and OmniSim support incremental design space exploration, which yields a speedup of up to 577× over C/RTL Co-simulation.