Implementing a MapReduce function in Elixir


MapReduce is a well-known pattern that is widely used in Big Data to analyze large data sets concurrently. In functional programming, Map and Reduce are two higher-order functions. Map takes a list and an anonymous function (a lambda) as its arguments, applies that function to each element of the list, and returns a new list made up of the results. Reduce accepts the same arguments, a list and a lambda, plus one additional argument in Elixir, an accumulator, and returns a single accumulated value instead of a list. In this article, we look at concurrency and MapReduce in Elixir, along with an example that demonstrates the pattern.
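To make the two functions concrete, here is a minimal sketch using Enum.map/2 and Enum.reduce/3 from Elixir's standard library. The word list is made up for illustration.

```elixir
words = ["elixir", "map", "reduce", "map"]

# Map: apply an anonymous function to every element; the result is a new list.
lengths = Enum.map(words, fn word -> String.length(word) end)
IO.inspect(lengths)

# Reduce: fold the list into one value, starting from the accumulator %{}.
# Here the accumulator is a map from each word to the number of times it was seen.
counts =
  Enum.reduce(words, %{}, fn word, acc ->
    Map.update(acc, word, 1, fn n -> n + 1 end)
  end)

IO.inspect(counts)
```

The reduce call is already a tiny word counter, which is the job the rest of the article spreads across several processes.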

Map Reduce example – A word count application

Before we start working on the MapReduce example, make sure Elixir is installed on your system. The installation steps and a kick-start example were discussed in the previous article. Let's create a new project for the word count application with the following command.


mix new mapreduceexp --module MapReduce

The above command creates a Mix project in the project directory mapreduceexp [C:\Users\xyz\mapreduceexp].


In the next step, navigate into your mapreduceexp directory and edit mix.exs, adding a line to the keyword list returned by def project.
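The exact line is not reproduced here. Because the application is later built with mix escript.build, which needs an escript entry naming a main module, one plausible reading is the addition sketched below. This is an assumption rather than the original listing, and the version strings are placeholders for whatever mix new generated on your machine.

```elixir
defmodule MapReduce.MixProject do
  use Mix.Project

  def project do
    [
      app: :mapreduceexp,
      version: "0.1.0",
      elixir: "~> 1.12",
      deps: deps(),
      # Assumed addition: tells `mix escript.build` which module defines main/1.
      escript: [main_module: MapReduce]
    ]
  end

  def application do
    [extra_applications: [:logger]]
  end

  defp deps do
    []
  end
end
```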

MapReduce function Flow

MapReduce in Elixir can be described as a pipeline through which the data flows in five steps. Each of the first five steps corresponds to a module written in Elixir, and a final main function ties them together.
  • Input reader: Step 1 is the input reader, which reads the input data, splits it into a format that the Map processes can easily read, and launches the Map processes.

  • Map process: The Map process reads the formatted data from the input reader and executes a function on each piece of data, emitting key-value pairs to the Partition (or Compare) process.

  • Partition or Compare process: The Partition process accumulates the key-value pairs supplied by the Map processes, compares them, and starts a Reduce process for each unique key.

  • Reduce process: The Reduce process adds up all the values for its key and emits the total to the output writer.

  • Output writer: The output writer accepts the processed data from the Reduce processes and writes it out in the desired format.

  • Main function to call modules and functions: In this step, navigate into the C:\Users\xyz\mapreduceexp\lib project directory and open the mapreduceexp.ex file. First, add imports for the modules InputReaderModule and PartitionModule. Next, define the main function [def main(args) do], a pipeline function for the case when no file is given, a pipeline function that takes the options arguments, and a parse_args function.
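The five stages above can be sketched compactly in a single module. This is not the article's original code: the module name WordCountSketch and its function names are hypothetical, the input is an in-memory string rather than a file, and Task.async_stream/2 stands in for hand-written Map and Reduce processes. Words are split on whitespace only, so punctuation handling is left out.

```elixir
defmodule WordCountSketch do
  # Step 1, input reader: split the text into lines, one unit of work each.
  def read(text), do: String.split(text, "\n", trim: true)

  # Step 2, map: turn one line into {word, 1} pairs.
  def map(line) do
    line
    |> String.downcase()
    |> String.split()
    |> Enum.map(fn word -> {word, 1} end)
  end

  # Step 3, partition: collect the emitted values under their key.
  def partition(pairs) do
    Enum.group_by(pairs, fn {word, _one} -> word end, fn {_word, one} -> one end)
  end

  # Step 4, reduce: add up the values collected for one key.
  def reduce({word, ones}), do: {word, Enum.sum(ones)}

  # Step 5, output writer: format the results as sorted "word: count" lines.
  def write(counts) do
    counts
    |> Enum.sort()
    |> Enum.map_join("\n", fn {word, count} -> "#{word}: #{count}" end)
  end

  # Pipeline: each line is mapped in its own task, each key is reduced in its own task.
  def run(text) do
    text
    |> read()
    |> Task.async_stream(&map/1)
    |> Enum.flat_map(fn {:ok, pairs} -> pairs end)
    |> partition()
    |> Task.async_stream(&reduce/1)
    |> Enum.map(fn {:ok, result} -> result end)
    |> write()
  end
end

IO.puts(WordCountSketch.run("the quick fox\nthe lazy dog\nThe end"))
```

Task.async_stream/2 runs each element in a separate process, with concurrency capped at the number of schedulers by default, which keeps the sketch short while preserving the idea of one Map process per chunk of input and one Reduce process per unique key.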

Building and Executing Application

At this point the development of our MapReduce app is complete. Next, we issue the command [mix escript.build] to build the application. The build generates an executable file named mapreduceexp.


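Next, we can run the generated file to observe the MapReduce output for the word count. A file produced by mix escript.build is normally started directly rather than through mix run, for example with [escript mapreduceexp --file=text\inputtext.txt] on Windows.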
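Inside main, the --file option has to be turned into a path, which is the job of the parse_args function mentioned earlier. A minimal sketch of such a function, using OptionParser from Elixir's standard library, might look like this. The module name CliSketch and the return values {:file, path} and :no_file are assumptions made for illustration.

```elixir
defmodule CliSketch do
  # Returns {:file, path} when --file is given, otherwise :no_file.
  def parse_args(args) do
    # `strict:` rejects switches other than --file, which end up in `_invalid`.
    {opts, _rest, _invalid} = OptionParser.parse(args, strict: [file: :string])

    case Keyword.fetch(opts, :file) do
      {:ok, path} -> {:file, path}
      :error -> :no_file
    end
  end
end

IO.inspect(CliSketch.parse_args(["--file=text/inputtext.txt"]))
IO.inspect(CliSketch.parse_args([]))
```

The two return shapes line up with the two pipeline functions described above: one for when a file is given and one for when it is not.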

Conclusion
In this article, we discussed how to implement MapReduce in Elixir, using a word count application as the example.
