The Paradigm of Probabilistic Programming
Probabilistic programming has seen major developments in the last few years. Once regarded as a quiet field concerned only with drawing inferences and implementing probabilistic models in statistical studies, it is now seeing increased adoption in many areas thanks to advances in machine learning and artificial intelligence. Its use cases include the study and simulation of particle collisions at the Large Hadron Collider (LHC), customer churn prediction, and time-to-event modeling to understand the dimensions of user experience, among others.
A team of researchers from MIT recently presented a paper on Gen, a newly developed system for probabilistic programming, at the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation. Gen is described as a general-purpose system built using Julia, the high-level dynamic programming language used for numerical analysis (itself a product of MIT researchers, released in 2012).
Gen is not the only probabilistic programming system; others have been catering to the needs of the field:
Edward – the Turing-Complete Language for Deep Probabilistic Programming;
Turing.jl – the Julia library for (universal) probabilistic programming;
Stan – the probabilistic programming language for statistical inference written in C++;
Pyro – the flexible, scalable deep probabilistic programming library built on PyTorch;
Brancher – an Object-Oriented Variational Probabilistic Programming Library built on PyTorch;
PyMC3 – a Python package for Bayesian statistical modeling and probabilistic machine learning which focuses on advanced Markov chain Monte Carlo and variational fitting algorithms;
Skpro – Supervised domain-agnostic prediction framework for probabilistic modeling;
Infer.NET – a framework by Microsoft for running Bayesian inference in graphical models.
Generality at the modeling level means the ability to represent a wide range of probabilistic models and to derive inferences from them.
Algorithmic efficiency means the ability to build custom, scalable, and highly sophisticated models without sacrificing performance or speed.
According to the MIT researchers, existing alternatives either lack generality at the modeling level or lack algorithmic efficiency when supporting generic modeling. Gen aims to strike a fine balance between the two.
Applications of Gen
Similar to what TensorFlow did for deep learning, Gen aims to take the statistical models and inference algorithms of probabilistic programming to the masses while taking care of the complicated equations, calculations, and efficiency.
Gen can be used to build applications in artificial intelligence, robotics, computer vision, and statistical programming. Its components can perform probabilistic and statistical simulations, deep learning simulations, and graphics rendering. Its statistical models can also be used for data analytics tasks.
Intel is collaborating with an MIT team to build depth-sensing cameras using Gen, for use in robotics and 3D augmented reality. MIT Lincoln Laboratory is using Gen to build aerial drones for relief and disaster emergency response.
Gen is also being used in DARPA's Machine Common Sense (MCS) program, which aims to build a model of human common sense.
Models are represented not as program code but as black boxes. These black boxes support inference and share a common interface called the generative function interface (GFI). Code written in different languages can be combined into a single model, with the compilers of those languages generating the GFI method implementations.
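The black-box idea can be illustrated with a minimal sketch in plain Python. This is not Gen's actual GFI: the interface name `GenerativeFunction`, its two methods, and both example models are hypothetical stand-ins, chosen only to show how generic code can work with any model that exposes the same small set of methods.

```python
import math
import random
from typing import Dict, Protocol, Tuple


class GenerativeFunction(Protocol):
    """Hypothetical common interface (an assumption, not Gen's real GFI)."""

    def simulate(self, rng: random.Random) -> Dict[str, float]:
        """Run the model and return its random choices, keyed by name."""
        ...

    def logpdf(self, choices: Dict[str, float]) -> float:
        """Return the log-probability of a complete set of choices."""
        ...


class CoinModel:
    """A black box whose single random choice is a biased coin flip."""

    def __init__(self, p_heads: float) -> None:
        self.p_heads = p_heads

    def simulate(self, rng: random.Random) -> Dict[str, float]:
        return {"flip": 1.0 if rng.random() < self.p_heads else 0.0}

    def logpdf(self, choices: Dict[str, float]) -> float:
        p = self.p_heads if choices["flip"] == 1.0 else 1.0 - self.p_heads
        return math.log(p)


class UniformModel:
    """A different black box: one draw from Uniform(low, high)."""

    def __init__(self, low: float, high: float) -> None:
        self.low = low
        self.high = high

    def simulate(self, rng: random.Random) -> Dict[str, float]:
        return {"x": rng.uniform(self.low, self.high)}

    def logpdf(self, choices: Dict[str, float]) -> float:
        inside = self.low <= choices["x"] <= self.high
        return -math.log(self.high - self.low) if inside else float("-inf")


def simulate_and_score(
    model: GenerativeFunction, seed: int
) -> Tuple[Dict[str, float], float]:
    """Generic code: it touches the model only through the two interface methods."""
    rng = random.Random(seed)
    choices = model.simulate(rng)
    return choices, model.logpdf(choices)
```

Here `simulate_and_score` never inspects how a model is written, so `CoinModel` and `UniformModel` are interchangeable from its point of view.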
Generative functions are one of the core abstractions in Gen. They are used to represent probabilistic models and inference models. Some generative functions are built in, but users can modify them and create their own custom generative functions. The @gen keyword (a Julia macro) identifies generative functions. Generative functions can also be combined: the three generative function combinators are map, unfold, and recurse.
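To make the combinator idea concrete, here is a small conceptual sketch in plain Python rather than Gen's Julia API. The names `noisy_reading` and `map_combinator`, and the convention that a generative function returns a value together with a dictionary of named random choices, are assumptions made for illustration only.

```python
import random
from typing import Callable, Dict, List, Tuple

# Addresses are tuples so that a combinator can prefix them with an index.
Choices = Dict[Tuple, float]


def noisy_reading(rng: random.Random, true_value: float) -> Tuple[float, Choices]:
    """Hypothetical generative function: a value plus Gaussian sensor noise."""
    noise = rng.gauss(0.0, 0.1)
    return true_value + noise, {("noise",): noise}


def map_combinator(
    kernel: Callable[[random.Random, float], Tuple[float, Choices]],
) -> Callable[[random.Random, List[float]], Tuple[List[float], Choices]]:
    """Build a new generative function that applies `kernel` to each input
    independently, namespacing each call's choices by its position."""

    def mapped(rng: random.Random, inputs: List[float]) -> Tuple[List[float], Choices]:
        values: List[float] = []
        choices: Choices = {}
        for i, x in enumerate(inputs):
            value, sub_choices = kernel(rng, x)
            values.append(value)
            for address, choice in sub_choices.items():
                choices[(i,) + address] = choice
        return values, choices

    return mapped


# Usage: three independent noisy readings produced by one combined function.
readings = map_combinator(noisy_reading)
values, choices = readings(random.Random(0), [1.0, 2.0, 3.0])
```

The combined function keeps every random choice addressable (for example `(1, "noise")` for the second reading), which is what lets later code inspect or change individual choices.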
Traces are records of the execution of a generative function. Traces can be visualized using rendering functions. Trace update methods take a trace as input and return a new trace that reflects changes made to the execution of the generative function.
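The trace-in, trace-out pattern can be sketched as follows. This is a conceptual illustration in plain Python, not Gen's implementation: the `Trace` class, the `update` function, and the two-variable model (mu drawn from Normal(0, 1), then y drawn from Normal(mu, 0.5)) are all assumptions chosen to keep the example small.

```python
import math
from dataclasses import dataclass
from typing import Dict, Tuple


def normal_logpdf(x: float, mu: float, sigma: float) -> float:
    """Log-density of Normal(mu, sigma) evaluated at x."""
    return -0.5 * math.log(2 * math.pi * sigma**2) - (x - mu) ** 2 / (2 * sigma**2)


@dataclass(frozen=True)
class Trace:
    """Record of one execution: the random choices and their log joint probability."""
    choices: Dict[str, float]
    score: float


def score_choices(choices: Dict[str, float]) -> float:
    """Hypothetical model: mu ~ Normal(0, 1), then y ~ Normal(mu, 0.5)."""
    return normal_logpdf(choices["mu"], 0.0, 1.0) + normal_logpdf(
        choices["y"], choices["mu"], 0.5
    )


def make_trace(choices: Dict[str, float]) -> Trace:
    return Trace(dict(choices), score_choices(choices))


def update(trace: Trace, changes: Dict[str, float]) -> Tuple[Trace, float]:
    """Return a new trace with some choices replaced, plus the change in log score.
    The input trace is left untouched."""
    new_trace = make_trace({**trace.choices, **changes})
    return new_trace, new_trace.score - trace.score


# Usage: move mu from 0.0 to 1.0 while keeping y fixed.
old_trace = make_trace({"mu": 0.0, "y": 1.0})
new_trace, log_weight = update(old_trace, {"mu": 1.0})
```

The returned log weight says how much more (or less) probable the modified execution is than the original, which is the kind of quantity an inference algorithm can use to accept or reject a proposed change.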
Inference algorithms are implemented so that they interact with the model only via the GFI. Gen provides a high-level inference programming library. GenViz helps with visualizing and debugging inference algorithms.
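As a rough illustration of inference code that sees the model only through an interface, the sketch below implements self-normalized importance sampling in plain Python. It is not Gen's inference library: the `generate` function, its signature, and the toy model (mu drawn from Normal(0, 1), observation y drawn from Normal(mu, 1)) are hypothetical.

```python
import math
import random
from typing import Callable, Dict, Tuple

Choices = Dict[str, float]
GenerateFn = Callable[[random.Random, Choices], Tuple[Choices, float]]


def normal_logpdf(x: float, mu: float, sigma: float) -> float:
    """Log-density of Normal(mu, sigma) evaluated at x."""
    return -0.5 * math.log(2 * math.pi * sigma**2) - (x - mu) ** 2 / (2 * sigma**2)


def generate(rng: random.Random, observations: Choices) -> Tuple[Choices, float]:
    """Hypothetical model entry point: sample the unobserved choice from the prior
    and return the log-likelihood of the observed choice as the log weight.
    Model: mu ~ Normal(0, 1); y ~ Normal(mu, 1)."""
    mu = rng.gauss(0.0, 1.0)
    log_weight = normal_logpdf(observations["y"], mu, 1.0)
    return {"mu": mu, "y": observations["y"]}, log_weight


def importance_posterior_mean(
    generate_fn: GenerateFn,
    observations: Choices,
    address: str,
    num_samples: int,
    seed: int,
) -> float:
    """Estimate the posterior mean of one choice. The algorithm only calls
    `generate_fn`; it knows nothing about the model's internals."""
    rng = random.Random(seed)
    samples = [generate_fn(rng, observations) for _ in range(num_samples)]
    # Subtract the maximum log weight before exponentiating for numerical stability.
    max_log_weight = max(log_weight for _, log_weight in samples)
    weights = [math.exp(log_weight - max_log_weight) for _, log_weight in samples]
    total = sum(weights)
    return sum(w * choices[address] for w, (choices, _) in zip(weights, samples)) / total


estimate = importance_posterior_mean(generate, {"y": 2.0}, "mu", 20000, seed=0)
```

For this conjugate toy model the exact posterior mean of mu given y = 2 is 1, so the estimate should land close to that value. Because the algorithm depends only on the `generate` signature, a different model could be swapped in without changing the inference code.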
AI Programming Language
Gen can be used as a powerful tool for programming artificial intelligence and its implementations. Probabilistic programming and statistical modeling bring a principled approach to solving a variety of complex problems and drawing inferences about them. Gen makes rapid iteration and prototyping possible, which improves productivity.
Probabilistic AI is one of the most promising frontiers of deep learning and artificial intelligence. Data is continuously processed, and the probabilities generated by inference algorithms are continuously updated based on what is learned from new data. Changing probabilities require changes in the model, so the model receives constant updates to stay optimized for making predictions on changing data.
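The idea of probabilities being revised as new data arrives can be shown with a very small example. The sketch below uses a textbook Beta-Bernoulli update in plain Python; it is unrelated to Gen's internals, and the class name `BetaBelief` is made up for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class BetaBelief:
    """Belief about an unknown success probability, stored as Beta(alpha, beta)."""
    alpha: float
    beta: float

    def mean(self) -> float:
        """Current point estimate of the success probability."""
        return self.alpha / (self.alpha + self.beta)

    def updated(self, outcomes: Iterable[bool]) -> "BetaBelief":
        """Return the belief after observing a batch of success/failure outcomes."""
        outcomes = list(outcomes)
        successes = sum(outcomes)
        failures = len(outcomes) - successes
        return BetaBelief(self.alpha + successes, self.beta + failures)


# Usage: start from a uniform prior and fold in data as it arrives.
belief = BetaBelief(1.0, 1.0)
belief = belief.updated([True, True, False])  # first batch
belief = belief.updated([True])               # a later observation
```

Updating batch by batch gives the same belief as processing all the data at once, which is what makes this style of model convenient for continuously arriving data.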
Gen documentation is available here: https://probcomp.github.io/Gen/dev/
The user first has to install Julia and then install the Gen package by invoking the Julia package manager.
pkg> add https://github.com/probcomp/Gen
The source code, documentation, and research paper for Gen are freely and publicly available.