j-marjanovic.io


About


Atom feed


Notes from FOSDEM 2019

I attended FOSDEM 2019 in Brussels:

Brussels

and these are my notes from Quantum Computers, CAD and Open Hardware and Python tracks:

Quantum Computing

Delivering Practical Quantum Computing on the D-Wave System

Intro

  • marketing slide for D-Wave Leap
  • practical = Adiabatic Quantum computer
  • Riggeti uses gate model instead
  • physical impl: 3 m high box + 3 normal racks
  • QPU (Quantum Processor Unit): 16x16 grid

theory

  • language
    • qubit
    • coupler (either both in same direction or opposite)
    • weights (? initial state)
    • strength (for couplers)
    • objective (function which gets minimized)
  • BQM (?)
  • 5 us annealing
  • problems: noise
  • noise can bring you in more than one solution (might be useful for some problems)

Markets

  • portfolio
  • internet ad
  • high-energy physics
  • image recognition

Q & A

  • clock speed -> no clock speed
  • total computation time -> 5 us to 1 s
  • Pegasus: proposed arch for a new machine
  • new Hamiltonioan in Pegasus
  • error corrections: from 2048 bits, no error corrections because of what they calculate
  • classical solutions vs quantum solutions: wall-clock time is the benchmark
  • hello world:
  • D-Wave: not a threat for a normal crypto (factoring a number is not a good problem)

D-Wave's Software Development Kit

  • tools and utilities for QC development

motivation

  • solution space is "smooth": good solutions are "grouped together"
  • step 1: problem as polynomial

equation

  • b-terms: linear bias
  • a-terms: quadratic bias

Quantum Machine Instr: biases in certain range: because of physical limitaiton? * Chimera graph and Pegasus graph

Ocean software

  • Python front-end, C++ for high performance

  • mapping methods

  • samplers
  • compute resources

dimod

  • API for samplers
  • BQM

cloud client

minorminer

dwavebinaryscp

constraint satisfaction

dwave-networkx

graph theory problem, same API as networkx

Steps

  1. translate to binary
  2. define BQM function
  3. BQM to matrix form
  4. BQM through sampler
  5. post-processing and interpretation

Conclusion

pip install dwave-ocean-sdk

Q & A

  • resolution: 0.01 resolution
  • Fujitsu and Hitachi: another providers
  • 8 couplers per qbit

D-Wave Hybrid Framework

  • decomposition -> split the problem to fit into BQM

github.com/dwavesystems/dwave-hybrid

  • solver/sampler framework
  • uses both quantum and classical resources
  • dataflow paradigm

What is IBMQ

quantum algorithms

  • current algorithms --> quantum algorithms
  • supra-polynomial speed-up
  • Schor's algorithm: polynomial time for factoring of the numbers
  • simulating quantum mechanics (Hamiltonian equations); for chemistry
  • factoring 1024-bit number: hours with QC
  • how to program a QC: mapping interference pattern on qbits

  • entanglement -> consistent quantum system -> colapse

  • every quantum program: circuit (no feedback!)
  • result is non deterministic (robust algorithms provide good results)

  • quantum volume (metric used at IBM)

  • coherence time: 100s of us

  • technology is quantum ready

quantum technologies

  • superconductive Josephson junction
  • entaglion (only hypothetical?)
  • QASM (quantum assembly language)
  • 5 GHz, 240 mK, low noise
  • all QC look similar: only way to do it
  • current state: oscilloscopes, signal generators, ...
  • pizza box for controlling the QC in the future

quantum advantage

  • IBM provides a lot of stuff on their GitHub
  • Jupyter notebooks, vscode plugin
  • Quiskit Acua: example problem: bonding energy for a molecule
  • Quiskit Aer: ...
  • publicly available QC
  • gate error and readout error are publicly available (~1e-3 for the example)
  • IBM has several QC architectures (Tokyo, Melburne)

Q & A

  • noise problems for QC, do qbits produce noise -> engineering will find a solution
  • quantum volue of Tokyo -> "the best, it is not just the number of qbits"
  • Quiskit Aqua: chemistry API

CAD and Open Hardware

gnucap

intro

  • mixed-signal simulator
  • prototype for Verilog-AMS (~10 years ago)
  • analog circuits and digital circuits simulators are different (transient vs event-based)

analog circuit simulation

  • node equations --> matrix form
  • often: differential equations, Newton iteration

digital circuit simulation

  • event based, with evaluation queue
  • can be used

Gnucap

  • Gnucap decompose the circuit matrix into L and U
  • Gnucap keeps track of the changes to the matrix, schedules an update to the circuit matrix
  • bypass = not computing something
  • Gnucap uses all the tricks to calculate inverse of the matrix (pivoting)

architecture of Gnucap

  • the concept (matrix solving + event-based updates) is tighly integrated in the codebase
  • plugin infrasctructure: modeling languages (VHDL, Verilog-AMS, SystemC considered)
  • shared library for basic s

  • compoenents are plugins (dlopen)

  • components

  • commands
  • algorithms

plugins

  • Turing complete
  • examples of plugins
  • import module (python)
  • insmod module (linux)

  • gnucap-python, e.g. Jupyter, user can access internal data, use Scipy, ...

  • Verilog-A in QUCS/gnucsator

license for models

  • Gnucap supports models from other sources

  • two types:

  • distributed as source code: --> just log it into the Gnucap (no issues)

  • distributed as binary: --> wrapper + blob

summary

  • mixed-mode is faster
  • more front-end work needed

ngspice

  • talk is not about the details, but about the framework, user interface and future

intro

  • input: standard SPICE text inpu
  • output: transient simulator
  • successor of spice3f5 from Berkley

  • three flavors:

  • standard executable: CLI, file and graphics output, control language

  • shared library for tcl/tk (not used so much)
  • C shared library (so/dll): input and output via callbacks

scripting language

  • its own library (developer don't like python)
  • 94 commands, math functions, loops, ...

device models

  • hard-coded models (BJS, MOS, JFET, xFET, trans lines, Verilog A interface via adms)
  • B source with build-in function
  • XSPICE shared library (written in C, both analog and digital)

application areas

  • PCB design
  • mix of ICs and discrete components
  • requires a comfortable user interface (offered by 3rd parties - e.g. KiCAD)
  • PSPICE and LTSPICE model requirements

  • IC design

  • models from the foundries (very reliable but complex)
  • supports HSPICE
  • MOS models, large circuits, certain speed
  • integration with other tools ongoing

mixed-signal capabilities

  • from XSPICE
  • digital: event based, signal strengths and delays
  • analog: C coded models, time and freq domain
  • simple example: digital is 50x faster than analog

experimental developments

  • KLU solver: 2x, 3x faster
  • CUDA for GPU: development on-going
  • Cider: 1D and 2D TCAD: device structure, solve physics equations

licenses

  • core: BSD, LGPL, ... no issues
  • Verilog A models: more complicated
  • vendor devices: can be used, but not distributed
  • IC model data: PDKs are under NDA (also encryption)

future

  • unicode
  • some commands (pz, ...)
  • integration with other tools and flows

openEMS - An Introduction and Overview

intro

  • FOSS solver for electromagnetics fields
  • simulate and evaluate RF and optical devices
  • uses FDTD (finite differences in time domain)
  • co-ordinate systems: cylindrial and cartesian
  • lumped elements available
  • human body models
  • dispersive models
  • support for remote simulation (cluster)

show cases

  • notch filter, very nice demo
  • examples: helical antena, antenna array, MRI antenna design (loop coils)
  • small size PCB antenna

interfacing

  • nice to have: interface to PCB editors
  • problem: link (between EMS and PCB editors) is very week

  • some examples:

  • hyp2mat
  • pcb-rnd
  • pcbmodelgen (KiCAD to openEMS)

  • ultimate goal: Circuit simulation <--> PCB design <--> RF simulation

status

  • openEMS is a mature EM simulation package
  • TODO list: improve the documentation, interface to tools, ...

Project Trellis and nextpnr

ECP5

  • 85k logic cells (4 LUTS, FF, carry), block RAM, 18x18 DSPs, SERDES (up to 5 GTs)
  • split into tiles, tiles split into slices
  • fixed wires
  • arcs and pip
  • all arcs and wires are undirectional - mux topology
  • dedicated clock network
  • programmable interconnect: pass gates (cascade of 2 mux)

status

  • bit and routing done
  • missing: DSP
  • timing documentation for fabric, logic cells, RAM, ...

text configuration format

  • tools to convert from and to bitstream
  • intermediate format for place & route

timing

  • not enought vendor support
  • delays for the cells extracted from SDF files
  • routing delay obtained using least-squares from reports for entire net

workflow

yosys

  • support ECP5, iCE40, Xilinx, ...
  • uses Berkley ABC for logic optimization
  • formal equivalence checking, assertions
  • ....

nextpnr

  • replacement for arachnepnr
  • developments from May 2018
  • timing-driven
  • architecture implements an API: useful for different architectures
  • each arch has its own binary: a lot of optimization possible
  • 7-series is VERY experimental (more work planned)
  • first implementation:
  • SA placer
  • A*+ripup router
  • future
  • analyty placer
  • SAT-based placer,
  • ...
  • nice graphical interface

Design Automation in Wonderland

intro

  • motivation and goals
  • reuse common functionality
  • easy to integrate, easy to adapt libraries
  • a set of modular libraries
  • based on Berkley ABC

  • use C++14 or C++17

  • header-only
  • well documented, well tested

libraries

lorina: parsing library

  • can pasrse very simple Verilog
  • parser reads Verilog and provides data to mockturtle

mockturtle: logic network library

  • network interface API
  • logic synth, opt, technology mapping
  • impelementations: and-inverted, kLUT, ...
  • performance tweeks

  • cut the combinatorial network into LUTs, based on cost function (speed/area)

kitty: truth table

  • manipulation of truth table

percy: exact synthesis library

  • re-synthesis

conclusion

  • exctract the logic function
  • optimization
  • mapping to tech

github.com/lsils

Open source virtual prototyping for faster hardware and software co-design

  • virtual prototyping

current development

  1. idea
  2. SW and HW developed in parallel
  3. integration

virtual prototype

  1. idea
  2. virtual prototype --> SW can be developed in parallel

virtual prototyping: SW environment simulating the HW

example

  • model entire SoC (RPi: quad core, peripheral)

  • issues:

  • models (of the IP) are hard to find
  • too much components --> needs to be done: shared effort

?

"interoperability is the key" --> take advantage of the community

  • toolchain: marketplace for components, GUI, ...

Q & A

  • interface with SystemC and TLM: support for TLM is there
  • modules: how to verify the model: ?

Lesson learned from Retro-uC and search for ideal HDL for open source silicon

intro

  • idea: open source microcontroller (Z80, MOS 65 and 68000, 3 uC in one chip)
    • a development board
  • "VHDL and Verilog are not the right tools for the job"

RTL faults

  • clock is logic signal
  • if --> mux or flip-flop
  • synth vs non-synth
  • Verilog: block and non-blocking
  • FPGA vs ASIC
  • RTFLRM

improvements

  • signed
  • process(all)
  • generate
  • ...

"putting a lipstick on a pig"

new developments

  • TL-Verilog
  • SystemC/TLM
  • "good tools are proprierty" (Vivado HLS, Catapult)
  • Panda Bamboo
  • GAUT (gaut.fr)

  • MyHDL

  • Chisel/SpinalHDL --> "going in the right direction" --> first need to learn Scala
  • Migen/MiSoc/nmigen --> prefered by the speaker

Python

CPython Memory Management

motivation

  • what user needs to know
  • learn how to control (gc, sys.getrefcount)
  • memory leaks

allocation of memory

  • CPython has PyObject for everything
  • size: obj size < 512 bytes --> small, ele big
  • big objects: system allocator
  • small object: 3 levels, pools and arena

  • 8-byte alignment --> size idx: size / 8 - 1

pools

  • 4k size, objects of same size
  • blocks

arenas

  • encapsulate pools
  • containts 64 pools

object specificts

  • string interning (simple string)
  • small integers (-5 to 256)

garbage collection

  • reference counting
  • easy to find unused obj
  • no marking
  • memory overhead
  • no cyclical references

tools in python

  • two modules: gc and tracemalloc
  • plus: sys._debugmallocstats()

That's all folks, 'till next year!