<p>UC Berkeley EECS Technical Reports</p>
<p>The UC Berkeley EECS Technical Memorandum Series provides a dated archive of EECS research. It includes Ph.D. theses and master's reports as well as technical documents that complement traditional publication media such as journals. For example, technical reports may document work in progress, early versions of results that are eventually published in more traditional media, and supplemental information such as long proofs, software documentation, code listings, or elaborated examples.</p>
<p><a href="http://www.eecs.berkeley.edu/Pubs/TechRpts/">http://www.eecs.berkeley.edu/Pubs/TechRpts/</a></p>
<p>Bounds on the Energy Consumption of Computational Kernels</p>
<p>
Andrew Gearhart</p>
<p>
EECS Department<br>
University of California, Berkeley<br>
Technical Report No. UCB/EECS-2014-175<br>
October 23, 2014</p>
<p>
<a href="http://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-175.pdf">http://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-175.pdf</a></p>
<p>As computing devices evolve with successive technology generations, many machines target either the mobile or the high-performance computing/datacenter environment. In both form factors, energy consumption often represents the limiting factor on hardware and software efficiency. On mobile devices, limitations in battery technology may reduce possible hardware capability due to a tight energy budget. On the other hand, large machines such as datacenters and supercomputers have budgets directly related to energy consumption, and small improvements in energy efficiency can significantly reduce operating costs. Such challenges have motivated research on the impact of applications, operating systems, and runtime systems on energy consumption. Until recently, however, little consideration was given to the potential energy efficiency of algorithms themselves.</p>
<p>A dominant idea within the high-performance computing (HPC) community is that applications can be decomposed into a set of key computational problems, called kernels. Via automatic performance tuning and new algorithms for many kernels, researchers have successfully demonstrated performance improvements on a wide variety of machines. Motivated by the large and growing cost (in time and energy) of moving data, algorithmic improvements have been attained by proving lower bounds on the data movement required to solve a computational problem, and then developing communication-optimal algorithms that attain these bounds.</p>
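To give a flavor of such bounds (an illustration of the classic matrix-multiplication result, not taken from this thesis): any execution of classical n×n matrix multiplication with a fast memory of M words must move on the order of n³/√M words between fast and slow memory, so enlarging fast memory yields only a square-root reduction in traffic.

```python
import math

# Classic communication lower bound for n x n matrix multiplication
# (Hong-Kung style): words moved between fast and slow memory scale as
# n**3 / sqrt(M) for a fast memory of M words. Constants are omitted.
def matmul_words_moved(n, M):
    return n**3 / math.sqrt(M)

# Quadrupling the fast memory only halves the required data movement:
ratio = matmul_words_moved(4096, 2**20) / matmul_words_moved(4096, 2**22)
print(ratio)  # 2.0
```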
<p>This thesis extends previous research on communication bounds and computational kernels by presenting bounds on the energy consumption of a large class of algorithms. These bounds apply to sequential, distributed-parallel, and heterogeneous machine models, and we detail methods to further extend these models to larger classes of machines. We argue that the energy consumption of computational kernels is usually predictable and can be captured by linear models with a handful of terms. Thus, these energy models (and the accompanying bounds) may apply to many HPC applications when used in composition. </p>
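As a concrete sketch of what "a linear model with a handful of terms" can look like (the term names and coefficient values here are illustrative placeholders, not the thesis's calibrated model):

```python
# Sketch of a linear energy model: flops F, words moved W, messages S, and
# runtime T, each weighted by a per-unit energy coefficient measured offline.
# All coefficient values below are made-up placeholders.
def energy_joules(F, W, S, T,
                  eps_flop=1e-10,  # J per floating-point operation
                  eps_word=2e-9,   # J per word moved
                  eps_msg=1e-6,    # J per message sent
                  p_static=5.0):   # static (leakage) power in watts
    return eps_flop * F + eps_word * W + eps_msg * S + p_static * T

# Example: 1e12 flops, 1e10 words, 1e5 messages, 10 s runtime.
print(energy_joules(1e12, 1e10, 1e5, 10.0))
```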
<p>Given energy bounds, we analyze the implications of such results under additional constraints, such as an upper bound on runtime, and also suggest directions for future research that may aid the development of a hardware/software co-tuning process. Further, we present a new model of energy efficiency, Cityscape, that allows hardware designers to quickly target areas for improvement in hardware attributes. We believe that combining our bounds with other models of energy consumption may provide a useful method for such co-tuning; i.e., to enable algorithm and hardware architects to develop provably energy-optimal algorithms on customized hardware platforms.</p>
<p><strong>Advisor:</strong> James Demmel and Tarek I. Zohdi</p>
<p>Fast 4D Sheared Filtering for Interactive Rendering of Distribution Effects</p>
<p>
Ling-Qi Yan, Soham Uday Mehta, Ravi Ramamoorthi and Fredo Durand</p>
<p>
EECS Department<br>
University of California, Berkeley<br>
Technical Report No. UCB/EECS-2014-174<br>
October 23, 2014</p>
<p>
<a href="http://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-174.pdf">http://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-174.pdf</a></p>
<p>Soft shadows, depth of field, and diffuse global illumination are common distribution effects, usually rendered by Monte Carlo ray tracing. Physically correct, noise-free images can require hundreds or thousands of ray samples per pixel and take a long time to compute. Recent approaches have exploited sparse sampling and filtering; the filtering is either fast (axis-aligned) but requires more input samples, or needs fewer input samples but is very slow (sheared). We present a new approach for fast sheared filtering on the GPU. Our algorithm factors the 4D sheared filter into four 1D filters. We derive complexity bounds for our method, showing that the per-pixel complexity is reduced from O(n^2 l^2) to O(nl), where n is the linear filter width (filter size is O(n^2)) and l is the (usually very small) number of samples for each dimension of the light or lens per pixel (spp is l^2). We thus reduce sheared filtering overhead dramatically. We demonstrate rendering of depth of field, soft shadows, and diffuse global illumination at interactive speeds. We reduce the number of samples needed by 5–8× compared to axis-aligned filtering, and framerates are 4× faster for equal quality.</p>
<p>MultiSE: Multi-Path Symbolic Execution using Value Summaries</p>
<p>
Koushik Sen, George Necula, Liang Gong and Philip Wontae Choi</p>
<p>
EECS Department<br>
University of California, Berkeley<br>
Technical Report No. UCB/EECS-2014-173<br>
October 17, 2014</p>
<p>
<a href="http://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-173.pdf">http://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-173.pdf</a></p>
<p>A Longitudinal and Cross-Dataset Study of Internet Latency and Path Stability</p>
<p>
Mosharaf Chowdhury, Rachit Agarwal, Vyas Sekar and Ion Stoica</p>
<p>
EECS Department<br>
University of California, Berkeley<br>
Technical Report No. UCB/EECS-2014-172<br>
October 11, 2014</p>
<p>
<a href="http://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-172.pdf">http://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-172.pdf</a></p>
<p>We present a retrospective and longitudinal study of Internet latency and path stability using three large-scale traceroute datasets collected over several years: Ark and iPlane from 2008 to 2013, and a proprietary CDN’s traceroute dataset spanning 2012 and 2013. Using these different “lenses”, we revisit classical properties of Internet paths such as end-to-end latency, stability, and routing graph structure. Iterative data analysis at this scale is challenging given the idiosyncrasies of different collection tools, measurement noise, and the diverse analyses we desire. To this end, we leverage recent big-data techniques to develop a scalable data analysis toolkit, Hummus, that enables rapid and iterative analysis on large traceroute measurement datasets. Our key findings are: (1) overall latency seems to be decreasing; (2) some geographical regions still have poor latency; (3) route stability (prevalence and persistence) is increasing; and (4) we observe a mixture of effects in the routing graph structure, with high-degree ASes rapidly increasing in degree and lower-degree ASes forming denser “communities”.</p>
<p>TypeDevil: Dynamic Type Inconsistency Analysis for JavaScript</p>
<p>
Michael Pradel, Parker Schuh and Koushik Sen</p>
<p>
EECS Department<br>
University of California, Berkeley<br>
Technical Report No. UCB/EECS-2014-171<br>
October 7, 2014</p>
<p>
<a href="http://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-171.pdf">http://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-171.pdf</a></p>
<p>Dynamic languages, such as JavaScript, give programmers the freedom to ignore types and enable them to write concise code in a short time. Despite this freedom, many programs follow implicit type rules, for example, that a function has a particular signature or that a property has a particular type. Violations of such implicit type rules often correlate with problems in the program. This paper presents TypeDevil, a mostly dynamic analysis that warns developers about inconsistent types. The key idea is to assign a set of observed types to each variable, property, and function, to merge types based on their structure, and to warn developers about variables, properties, and functions that have inconsistent types. To deal with the pervasiveness of polymorphic behavior in real-world JavaScript programs, we present a set of techniques to remove spurious warnings and to merge related warnings. Applying TypeDevil to widely used benchmark suites and real-world web applications reveals 15 problematic type inconsistencies, including correctness problems, performance problems, and dangerous coding practices.</p>
<p>Provably Efficient Algorithms for Numerical Tensor Algebra</p>
<p>
Edgar Solomonik</p>
<p>
EECS Department<br>
University of California, Berkeley<br>
Technical Report No. UCB/EECS-2014-170<br>
September 30, 2014</p>
<p>
<a href="http://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-170.pdf">http://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-170.pdf</a></p>
<p>This thesis targets the design of parallelizable algorithms and communication-efficient parallel schedules for numerical linear algebra as well as computations with higher-order tensors. Communication is a growing bottleneck in the execution of most algorithms on parallel computers; it manifests itself both as data movement, through the network connecting different processors and through the memory hierarchy of each processor, and as synchronization between processors. We provide a rigorous theoretical model of communication and derive lower bounds as well as algorithms in this model. Our analysis concerns two broad areas: linear algebra and tensor contractions. We demonstrate the practical quality of the new theoretically-improved algorithms by presenting results which show that our implementations outperform standard libraries and traditional algorithms.</p>
<p>We model the costs associated with local computation, communication, and synchronization. We introduce a new technique for deriving lower bounds on tradeoffs between these costs and apply it to algorithms in both dense and sparse linear algebra as well as graph algorithms. These lower bounds are attained by what we refer to as 2.5D algorithms, which we give for matrix multiplication, Gaussian elimination, QR factorization, the symmetric eigenvalue problem, and the Floyd-Warshall all-pairs shortest-paths algorithm. 2.5D algorithms achieve lower interprocessor bandwidth cost by exploiting auxiliary memory. Algorithms employing this technique are well known for matrix multiplication, and have been derived in the BSP model for LU and QR factorization, as well as the Floyd-Warshall algorithm. We introduce alternate versions of LU and QR algorithms which have measurable performance improvements over their BSP counterparts, and we give the first evaluations of their performance. For the symmetric eigenvalue problem, we give the first 2.5D algorithms, additionally solving challenges with memory-bandwidth efficiency that arise for this problem. We also give a new memory-bandwidth-efficient algorithm for Krylov subspace methods (repeated multiplication of a vector by a sparse matrix). </p>
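The bandwidth saving from auxiliary memory can be illustrated with a back-of-the-envelope calculation (a sketch of the published asymptotics with constants omitted; the processor counts and matrix size are hypothetical): a 2.5D algorithm storing c replicas of the matrices on P processors moves on the order of n²/√(cP) words per processor, versus n²/√P for a standard 2D algorithm.

```python
import math

# Per-processor words communicated for n x n matrix multiplication on P
# processors with replication factor c (asymptotic form, constants omitted).
def words_per_processor(n, P, c=1):
    return n**2 / math.sqrt(c * P)

two_d  = words_per_processor(8192, P=64)       # c = 1: standard 2D algorithm
two_5d = words_per_processor(8192, P=64, c=4)  # 4x the memory per processor
print(two_d / two_5d)  # 2.0 -- quadrupling memory halves the bandwidth cost
```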
<p>The latter half of the thesis contains algorithms for higher-order tensors, in particular tensor contractions. We introduce Cyclops Tensor Framework, which provides an automated mechanism for network-topology-aware decomposition and redistribution of tensor data. It leverages 2.5D matrix multiplication to perform tensor contractions communication-efficiently. The framework is capable of exploiting symmetry and antisymmetry in tensors and utilizes a distributed packed-symmetric storage format. Finally, we consider a theoretically novel technique for exploiting tensor symmetry to lower the number of multiplications necessary to perform a contraction, via computing some redundant terms that allow preservation of symmetry and then cancelling them out at low-order cost. We analyze the numerical stability and communication efficiency of this technique and give adaptations to antisymmetric and Hermitian matrices. This technique has promising potential for accelerating coupled-cluster (electronic structure) methods in terms of both computation and communication cost.</p>
<p><strong>Advisor:</strong> James Demmel</p>
<p>High Performance Machine Learning through Codesign and Rooflining</p>
<p>
Huasha Zhao and John F. Canny</p>
<p>
EECS Department<br>
University of California, Berkeley<br>
Technical Report No. UCB/EECS-2014-169<br>
September 27, 2014</p>
<p>
<a href="http://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-169.pdf">http://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-169.pdf</a></p>
<p>Machine learning (ML) is a cornerstone of the new data revolution. Most attempts to scale machine learning to massive datasets focus on parallelization across computer clusters. The BIDMach project instead explores the untapped potential (especially from GPU and SIMD hardware) inside individual machines. Through careful codesign of algorithms and “rooflining”, we have demonstrated multiple orders of magnitude speedup over other systems. In fact, BIDMach running on a single machine exceeds the performance of cluster systems on most common ML tasks, and has run compute-intensive tasks on 10-terabyte datasets. We further show that BIDMach runs close to the theoretical limits imposed by CPU/GPU, memory, or network bandwidth. BIDMach includes several innovations to make the data modeling process more agile and effective: likelihood “mixins” and interactive modeling using Gibbs sampling.</p>
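For context, a roofline bound of the kind referenced above caps attainable throughput at the minimum of peak compute rate and memory bandwidth times arithmetic intensity; the hardware numbers below are hypothetical, not BIDMach measurements:

```python
# Generic roofline bound: attainable GFLOP/s for a kernel with a given
# arithmetic intensity (flops per byte of memory traffic).
def roofline_gflops(peak_gflops, bandwidth_gb_s, flops_per_byte):
    return min(peak_gflops, bandwidth_gb_s * flops_per_byte)

# A bandwidth-bound kernel vs. a compute-bound one on made-up hardware:
print(roofline_gflops(4000.0, 200.0, 0.5))    # 100.0  (memory-bound)
print(roofline_gflops(4000.0, 200.0, 100.0))  # 4000.0 (compute-bound)
```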
<p>These results are very encouraging, but the greatest potential for future hardware-leveraged machine learning appears to be in MCMC algorithms: we can bring the performance of sample-based Bayesian inference close to that of symbolic methods. This opens the possibility of a general-purpose “engine” for machine learning whose performance matches specialized methods. We demonstrate this approach on a specific problem (Latent Dirichlet Allocation) and discuss the general case. </p>
<p>Finally, we explore scaling ML to clusters. In order to benefit from parallelization, rooflined nodes require very high network bandwidth. We show that the aggregators (reducers) of other systems do not scale and are not adequate for this task. We describe two new approaches, butterfly mixing and “Kylix”, which cover the requirements of machine learning and graph algorithms respectively. We give roofline bounds for both approaches.</p>
<p><strong>Advisor:</strong> John F. Canny</p>
<p>A Hybrid Dynamical Systems Theory for Legged Locomotion</p>
<p>
Samuel Burden</p>
<p>
EECS Department<br>
University of California, Berkeley<br>
Technical Report No. UCB/EECS-2014-167<br>
September 25, 2014</p>
<p>
<a href="http://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-167.pdf">http://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-167.pdf</a></p>
<p>Legged locomotion arises from intermittent contact between limbs and terrain. Since locomotion emerges from this closed-loop interaction, reductionist studies of body mechanics and terrestrial dynamics in isolation have failed to yield comprehensive strategies for forward- or reverse-engineering locomotion. Progress in locomotion science stands to benefit a diverse array of engineers, scientists, and clinicians working in robotics, neuromechanics, and rehabilitation. Eschewing reductionism in favor of a holistic study, we seek a systems-level theory tailored to the dynamics of legged locomotion.</p>
<p>Parsimonious mathematical models for legged locomotion are hybrid, as the system state undergoes continuous flow through limb stance and swing phases punctuated by instantaneous reset at discrete touchdown and liftoff events. In their full generality, hybrid systems can exhibit properties such as nondeterminism and orbital instability that are inconsistent with observations of organismal biomechanics. By specializing to a class of intrinsically self-consistent dynamical models, we exclude such pathologies while retaining emergent phenomena that arise in closed-loop studies of locomotion. </p>
<p>Beginning with a general class of hybrid control systems, we construct an intrinsic state-space metric and derive a provably-convergent numerical simulation algorithm. This resolves two longstanding problems in hybrid systems theory: non-trivial comparison of states from distinct discrete modes, and accurate simulation up to and including Zeno events. Focusing on models for periodic gaits, we prove that isolated discrete transitions generically lead the hybrid dynamical system to reduce to an equivalent classical (smooth) dynamical system. This novel route to reduction in models of rhythmic phenomena demonstrates that the closed-loop interaction between limbs and terrain is generally simpler than either taken in isolation. Finally, we show that the non-smooth flow resulting from arbitrary footfall timing possesses a non-classical (Bouligand) derivative. This provides a foundation for design and control of multi-legged maneuvers. Taken together, these contributions yield a unified analytical and computational framework -- a hybrid dynamical systems theory -- applicable to legged locomotion.</p>
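As an illustration (ours, not the report's model), the bouncing ball is the textbook hybrid system exhibiting a Zeno event: impact times accumulate at a finite limit, which is precisely the situation a convergent simulation algorithm must handle.

```python
import math

# Bouncing ball: continuous flow under gravity, discrete reset at each
# impact with restitution coefficient e < 1. The gaps between impacts
# shrink geometrically, so infinitely many events accumulate at a finite
# Zeno time.
def impact_times(h0=1.0, g=9.8, e=0.5, n=30):
    t = math.sqrt(2.0 * h0 / g)   # time of first touchdown
    v = math.sqrt(2.0 * g * h0)   # speed at first impact
    times = [t]
    for _ in range(n - 1):
        v *= e                    # reset map: velocity damped by restitution
        t += 2.0 * v / g          # duration of the next ballistic flight
        times.append(t)
    return times

ts = impact_times()
print(ts[-1] - ts[-2] < 1e-6)  # True: event times converge to a Zeno point
```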
<p><strong>Advisor:</strong> S. Shankar Sastry</p>
<p>A Learning Based Approach to Control Synthesis of Markov Decision Processes for Linear Temporal Logic Specifications</p>
<p>
Dorsa Sadigh, Eric Kim, Samuel Coogan, S. Shankar Sastry and Sanjit A. Seshia</p>
<p>
EECS Department<br>
University of California, Berkeley<br>
Technical Report No. UCB/EECS-2014-166<br>
September 20, 2014</p>
<p>
<a href="http://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-166.pdf">http://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-166.pdf</a></p>
<p>We propose to synthesize a control policy for a Markov decision process (MDP) such that the resulting traces of the MDP satisfy a linear temporal logic (LTL) property. We construct a product MDP that incorporates a deterministic Rabin automaton generated from the desired LTL property. The reward function of the product MDP is defined from the acceptance condition of the Rabin automaton. This construction allows us to apply techniques from learning theory to the problem of synthesis for LTL specifications even when the transition probabilities are not known a priori. We prove that our method is guaranteed to find a controller that satisfies the LTL property with probability one if such a policy exists, and we show empirically, through a case study in traffic control, that our method produces reasonable control strategies even when the LTL property cannot be satisfied with probability one.</p>
<p>Accuracy of the s-step Lanczos method for the symmetric eigenproblem</p>
<p>
Erin Carson and James Demmel</p>
<p>
EECS Department<br>
University of California, Berkeley<br>
Technical Report No. UCB/EECS-2014-165<br>
September 17, 2014</p>
<p>
<a href="http://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-165.pdf">http://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-165.pdf</a></p>
<p>The <em>s</em>-step Lanczos method is an attractive alternative to the classical Lanczos method, as it enables an O(<em>s</em>) reduction in data movement over a fixed number of iterations. This can significantly improve performance on modern computers. In order for <em>s</em>-step methods to be widely adopted, it is important to better understand their error properties. Although the <em>s</em>-step Lanczos method is equivalent to the classical Lanczos method in exact arithmetic, empirical observations demonstrate that it can behave quite differently in finite precision.</p>
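A rough sketch of where the O(s) reduction comes from (our simplification, not the paper's analysis): a matrix-powers kernel can compute s Krylov basis vectors per read of a well-partitioned sparse matrix, so k total iterations require about k/s passes over A instead of k.

```python
# Approximate passes over the sparse matrix A for k Lanczos iterations,
# assuming one read of A yields s Krylov basis vectors (classical: s = 1).
def passes_over_A(k, s=1):
    return -(-k // s)  # ceil(k / s)

print(passes_over_A(100))       # 100 passes: classical Lanczos
print(passes_over_A(100, s=5))  # 20 passes: s-step variant with s = 5
```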
<p>In this paper, we demonstrate that bounds on accuracy for the finite precision Lanczos method given by Paige [<em>Lin. Alg. Appl.</em>, 34:235–258, 1980] can be extended to the <em>s</em>-step Lanczos case, assuming a bound on the condition numbers of the computed <em>s</em>-step bases. Our results confirm theoretically what is well known empirically: the conditioning of the Krylov bases plays a large role in determining finite precision behavior. In particular, if one can guarantee that the basis condition number is not too large throughout the iterations, the accuracy and convergence of eigenvalues in the <em>s</em>-step Lanczos method should be similar to those of classical Lanczos. This indicates that, under certain restrictions, the <em>s</em>-step Lanczos method can be made suitable for use in many practical cases.</p>
<p>Ptolemy Coding Style</p>
<p>
Christopher Brooks and Edward A. Lee</p>
<p>
EECS Department<br>
University of California, Berkeley<br>
Technical Report No. UCB/EECS-2014-164<br>
September 5, 2014</p>
<p>
<a href="http://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-164.pdf">http://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-164.pdf</a></p>
<p>Collaborative software projects benefit when participants read code created by other participants. The objective of a coding style is to reduce the fatigue induced by unimportant formatting differences and differences in naming conventions. Although individual programmers will undoubtedly have preferences and habits that differ from the recommendations here, the benefits that flow from following these recommendations far outweigh the inconveniences. Published papers in journals are subject to similar stylistic and layout constraints, so such constraints are not new to the academic community. This document describes the coding style used in Ptolemy II, a package with 550K lines of Java and 160 contributing programmers that has been under development since 1996.</p>
<p>System Design Trade-Offs in a Next-Generation Embedded Wireless Platform</p>
<p>
Michael P Andersen and David E. Culler</p>
<p>
EECS Department<br>
University of California, Berkeley<br>
Technical Report No. UCB/EECS-2014-162<br>
August 25, 2014</p>
<p>
<a href="http://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-162.pdf">http://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-162.pdf</a></p>
<p>Over the course of the past decade, the evolution of advanced low-energy microcontrollers has raised three questions, which this paper outlines and addresses. The first question is: Can a 32-bit platform be constructed that provides advanced features but fits within the energy constraints of a wireless sensor network? We answer this in the affirmative by presenting the design and preliminary evaluation of Storm, one such system based on an ARM Cortex-M4 that achieves 2.3 μA idle current with a 1.5 μs wake-up time. The second question we answer is: Can this platform simultaneously meet the very different demands of both monitoring-type applications and cyber-physical systems? We demonstrate that this is indeed possible and present the design trade-offs that must be made to achieve this, yielding a module with a rich set of exported peripherals that fits in a 16 mm x 26 mm form factor. The final question explored by this paper is: If such a platform is possible, what new opportunities and challenges would it hold for embedded operating systems? We answer this by showing that the use of modern 32-bit microcontrollers requires reconsidering the system architecture governing power management, clock selection, and inter-module dependencies, as well as offering opportunities for supervisory code and the coordination of common tasks without CPU intervention.</p>
<p>Programming by Manipulation for Layout</p>
<p>
Thibaud Hottelier, Ras Bodik and Kimiko Ryokai</p>
<p>
EECS Department<br>
University of California, Berkeley<br>
Technical Report No. UCB/EECS-2014-161<br>
August 25, 2014</p>
<p>
<a href="http://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-161.pdf">http://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-161.pdf</a></p>
<p>We present Programming by Manipulation, a new programming methodology for specifying the layout of data visualizations, targeted at non-programmers. We address the two central sources of bugs that arise when programming with constraints: ambiguities and conflicts (inconsistencies). We rule out conflicts by design and exploit ambiguity to explore possible layout designs. Our users design layouts by highlighting undesirable aspects of a current design, effectively breaking spurious constraints and introducing ambiguity by giving some elements freedom to move or resize. Subsequently, the tool indicates how the ambiguity can be removed by computing how the free elements can be fixed with the available constraints. To support this workflow, our tool computes the ambiguity and summarizes it visually. We evaluate our work with two user studies demonstrating that both non-programmers and programmers can effectively use our prototype. Our results suggest that our tool is five times more productive than direct programming with constraints.</p>
<p>Dynamic and Interactive Synthesis of Code Snippets</p>
<p>
Joel Galenson</p>
<p>
EECS Department<br>
University of California, Berkeley<br>
Technical Report No. UCB/EECS-2014-160<br>
August 20, 2014</p>
<p>
<a href="http://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-160.pdf">http://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-160.pdf</a></p>
<p>Many code fragments are difficult to write. For example, using new and unfamiliar APIs can be a complex task with a steep learning curve. Implementing a complex data structure requires discovering and understanding all of its corner cases. And more and more end users with little to no formal training are trying to write code, whether they are scientists writing simulations or kids writing mobile apps. For all of these reasons and more, programming is a difficult task, which leads to bugs and delays in software.</p>
<p>There are many tools that help programmers find code fragments involving complex APIs, but many are somewhat inexpressive and rely on static information. We present a new technique, which we call CodeHint, that generates and evaluates code at runtime and hence can synthesize real-world Java code that involves I/O, reflection, native calls, and other advanced language features. Our approach is dynamic (giving accurate results and allowing programmers to reason about concrete executions), easy-to-use (supporting a wide range of correctness specifications), and interactive (allowing users to refine the candidate code snippets). We evaluate CodeHint and show that its algorithms are efficient and that in two user studies it improves programmer productivity by more than a factor of two. </p>
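A toy sketch of the generate-and-evaluate idea in Python (CodeHint itself synthesizes Java and is far more sophisticated; the function below is purely illustrative): enumerate small candidate expressions over in-scope values, evaluate each at runtime, and keep those satisfying the user's correctness specification.

```python
import itertools

# Enumerate expressions over the in-scope environment up to a small depth,
# evaluate them on concrete values, and keep those that satisfy `spec`.
def synthesize(env, ops, spec, max_depth=2):
    """env: name -> value; ops: list of (symbol, fn); spec: value -> bool."""
    exprs = {name: value for name, value in env.items()}
    for _ in range(max_depth):
        for (sym, fn), (e1, v1), (e2, v2) in itertools.product(
                ops, list(exprs.items()), list(exprs.items())):
            try:
                exprs.setdefault(f"({e1} {sym} {e2})", fn(v1, v2))
            except Exception:
                pass  # candidate crashed at runtime: discard it
    return [e for e, v in exprs.items() if spec(v)]

# Find expressions over x and y whose runtime value is 7:
found = synthesize({"x": 3, "y": 4},
                   [("+", lambda a, b: a + b), ("*", lambda a, b: a * b)],
                   lambda v: v == 7)
print(found)  # ['(x + y)', '(y + x)']
```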
<p>Our second contribution addresses a different difficulty: programmers and end users often find it easy to explain an algorithm on a whiteboard or with pictures in a textbook, but struggle to write the code correctly. We propose a new methodology that allows users to program by demonstrating how an algorithm proceeds on concrete inputs. To reduce the burden of these demonstrations on the user, we have developed pruning algorithms to remove ambiguities in the demonstrations and control-flow inference algorithms to infer missing conditionals in demonstrations. These two techniques take advantage of the knowledge encoded in the user's partial correctness condition. We show that this approach is effective in practice by analyzing its performance on several common algorithms.</p>
<p><strong>Advisor:</strong> Ras Bodik and Koushik Sen</p>
<p>Enabling Portable Building Applications through Automated Metadata Transformation</p>
<p>
Arka Bhattacharya, David E. Culler, Jorge Ortiz, Dezhi Hong and Kamin Whitehouse</p>
<p>
EECS Department<br>
University of California, Berkeley<br>
Technical Report No. UCB/EECS-2014-159<br>
August 19, 2014</p>
<p>
<a href="http://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-159.pdf">http://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-159.pdf</a></p>
<p>Sensor network research has facilitated advancements in various domains, such as industrial monitoring and environmental sensing, and research challenges have shifted from creating infrastructure to utilizing it. Commercial buildings provide a valuable setting for investigating automated metadata acquisition and augmentation: they typically comprise large sensor networks, but have limited, obscure 'tags' that are often meaningful only to the facility managers. Moreover, this primitive metadata is imprecise and varies across vendors and deployments. Extracting meaningful information from a building's sensor data, or writing control applications that use the data, depends on the metadata available to interpret it, whether provided by novel networks or legacy instrumentation.</p>
<p>This state of the art is a fundamental barrier to scaling analytics or intelligent control across the building stock, as even the basic steps involve labor-intensive manual effort by highly trained consultants. Writing applications against a building's sensor network remains largely intractable, as it requires extensive help from an expert in each building's design and operation to identify the sensors of interest and create the associated metadata. This process is repeated for each application developed in a particular building, and again across different buildings, resulting in customized building-specific application queries that are neither portable nor scalable. </p>
<p>We present a synthesis technique that learns how to transform a building's primitive sensor metadata to a common namespace using a small number of examples from an expert, such as the building manager. Once the transformation rules are learned for one building, they can be applied across buildings with a similar metadata structure. This common, understandable namespace can enable analytics applications that do not require a priori building-specific knowledge. </p>
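A minimal sketch of what a learned transformation rule might look like (the tag formats, rule patterns, and target namespace below are hypothetical, not the paper's): regex rewrites mapping a building's primitive tags into a common namespace, with unmatched tags flagged for the expert.

```python
import re

# Hypothetical rules for one building's vendor-specific tag format,
# as might be learned from a handful of expert-provided examples.
rules = [
    (re.compile(r"^SODA1R(\d+)_+ART"), r"soda/room\1/airflow"),
    (re.compile(r"^SODA1R(\d+)_+RVAV"), r"soda/room\1/temp_setpoint"),
]

def transform(tag):
    for pattern, template in rules:
        if pattern.match(tag):
            return pattern.sub(template, tag)
    return None  # no learned rule applies: flag this tag for the expert

print(transform("SODA1R410__ART"))  # soda/room410/airflow
```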
<p>Initial results show that learning the rules to transform 70% of the primitive metadata of two buildings (with completely different metadata structures), comprising 1600 and 2600 sensors, into a common namespace took only 21 and 27 examples, respectively. The learned rules were able to transform similar primitive metadata in about 60 other buildings as well, enabling the writing of portable applications across these buildings.</p>