Electrical Engineering and Computer Sciences

Infusing Parallelism into Introductory Computer Science Curriculum using MapReduce

Matthew Johnson, Robert H. Liao, Alexander Rasmussen, Ramesh Sridharan, Dan Garcia and Brian K. Harvey

EECS Department
University of California, Berkeley
Technical Report No. UCB/EECS-2008-34
April 10, 2008

http://www.eecs.berkeley.edu/Pubs/TechRpts/2008/EECS-2008-34.pdf

We have incorporated cluster computing fundamentals into the introductory computer science curriculum at UC Berkeley. For the first course, we have developed coursework and programming problems in Scheme centered around Google's MapReduce. To allow students familiar only with Scheme to write and run MapReduce programs, we designed a functional interface in Scheme and implemented software that runs tasks in parallel on a cluster. The streamlined interface lets students focus on the essence of the MapReduce model and avoid the potentially cumbersome details of the MapReduce implementation, which gives it a clear pedagogical advantage. The interface's simplicity and purely functional treatment allow students to tackle data-parallel problems after the first two-thirds of the first introductory course. In this paper we describe the system that interfaces our Scheme interpreter with a cluster running Hadoop (a Java-based MapReduce implementation). Our design can serve as a prototype for other such interfaces in educational environments that do not use Java and therefore cannot simply use Hadoop. We also outline the MapReduce exercises we have added to our introductory course, which allow students in an introductory programming class to begin working with data-parallel programs and designs.
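
For readers unfamiliar with the model the abstract refers to, the following is a minimal sketch of MapReduce written in a purely functional style. It is not the Scheme interface described in the report: it is written in Python, it runs sequentially on one machine, and the names `map_reduce` and `word_count_mapper` are hypothetical, chosen only for illustration. On a real cluster the map calls and the per-key reductions would be distributed across machines.

```python
from collections import defaultdict
from functools import reduce


def map_reduce(mapper, reducer, initial, records):
    """Sequential model of MapReduce (illustrative only, not the report's API).

    mapper:  record -> iterable of (key, value) pairs
    reducer: (accumulator, value) -> accumulator
    initial: starting accumulator for every key
    """
    # Map phase: apply the mapper to every record and group values by key.
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    # Reduce phase: fold each key's values independently of the other keys.
    return {key: reduce(reducer, values, initial) for key, values in groups.items()}


def word_count_mapper(line):
    """Emit (word, 1) for each whitespace-separated word, ignoring case."""
    return [(word.lower(), 1) for word in line.split()]


counts = map_reduce(
    word_count_mapper,
    lambda total, n: total + n,
    0,
    ["the cat", "The dog and the cat"],
)
print(counts)
```

Because the mapper and reducer are pure functions with no shared state, each record and each key group can be processed independently, which is the property that makes the model data-parallel.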


BibTeX citation:

@techreport{Johnson:EECS-2008-34,
    Author = {Johnson, Matthew and Liao, Robert H. and Rasmussen, Alexander and Sridharan, Ramesh and Garcia, Dan and Harvey, Brian K.},
    Title = {Infusing Parallelism into Introductory Computer Science Curriculum using MapReduce},
    Institution = {EECS Department, University of California, Berkeley},
    Year = {2008},
    Month = {Apr},
    URL = {http://www.eecs.berkeley.edu/Pubs/TechRpts/2008/EECS-2008-34.html},
    Number = {UCB/EECS-2008-34},
    Abstract = {We have incorporated cluster computing fundamentals into the introductory computer science curriculum at UC Berkeley. For the first course, we have developed coursework and programming problems in Scheme centered around Google's MapReduce. To allow students familiar only with Scheme to write and run MapReduce programs, we designed a functional interface in Scheme and implemented software that runs tasks in parallel on a cluster. The streamlined interface lets students focus on the essence of the MapReduce model and avoid the potentially cumbersome details of the MapReduce implementation, which gives it a clear pedagogical advantage.

The interface's simplicity and purely functional treatment allow students to tackle data-parallel problems after the first two-thirds of the first introductory course.

In this paper we describe the system that interfaces our Scheme interpreter with a cluster running Hadoop (a Java-based MapReduce implementation). Our design can serve as a prototype for other such interfaces in educational environments that do not use Java and therefore cannot simply use Hadoop. We also outline the MapReduce exercises we have added to our introductory course, which allow students in an introductory programming class to begin working with data-parallel programs and designs.}
}

EndNote citation:

%0 Report
%A Johnson, Matthew
%A Liao, Robert H.
%A Rasmussen, Alexander
%A Sridharan, Ramesh
%A Garcia, Dan
%A Harvey, Brian K.
%T Infusing Parallelism into Introductory Computer Science Curriculum using MapReduce
%I EECS Department, University of California, Berkeley
%D 2008
%8 April 10
%@ UCB/EECS-2008-34
%U http://www.eecs.berkeley.edu/Pubs/TechRpts/2008/EECS-2008-34.html
%F Johnson:EECS-2008-34