SC21 Panel on Programming Models – Tackling Data Movement, DSLs, More

By John Russell
January 6, 2022
How will programming future systems differ from current practice? This is an ever-present question in computing. Yet it has, perhaps, never been more pressing given the rise of heterogeneous architectures and diverse hardware, the steady incorporation of AI technology, and the proliferation of new programming languages and models.
At SC21, a distinguished panel tackled this broad question. Higher levels of abstraction, a clearer focus on data movement – not compute functions – and the rise of domain-specific languages as important tools were among the dominant points of discussion, which touched on topics as diverse as programming Cerebras’s wafer-scale chip and targeting FPGAs.
Moderated by Hal Finkel (DOE), the panelists included Kathy Yelick (UC Berkeley), Saman Amarasinghe (MIT), Torsten Hoefler (ETH Zürich), Maya Gokhale (LLNL) and Justin Gottschlich (Intel). Capturing the full discussion is too daunting, but each panelist made an opening statement that captures (at least directionally) much of their thinking. Presented here are brief portions (lightly edited) of panelists’ opening remarks.
Yelick, who had just assumed her new role as vice chancellor for research at UC Berkeley, kicked off the panel saying, “[In] scientific computing, in general, I think we should think about how people are programming at a much higher level of abstraction than we’re used to. I think if you look at machine learning, and the packages that people have built for machine learning, they’ve really shown that you can, with a lot of work in terms of how you implement some of those underlying algorithms, get very good performance out of those.
“That opens up HPC-type of access to a much broader community of people if they can program at the level of something like TensorFlow. And I’d like people to also think a little bit about systems like Julia and Jupyter notebooks as really the interface to the computers, rather than thinking about programming and languages based on things like C/C++ or Fortran. So really, I’m going to be advocating for a much higher level of abstraction, which is not to say that some of us won’t still be programming at a much lower level.”
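To make that abstraction gap concrete, here is a minimal sketch – illustrative only, not an example from the panel – contrasting explicit C/Fortran-style loops with the one-line, framework-level expression Yelick has in mind. NumPy stands in for TensorFlow- or Julia-level programming, and the matrix sizes are arbitrary:

```python
# A sketch of the abstraction gap: explicit loops vs. a one-line,
# framework-level expression. Sizes and names are illustrative.
import numpy as np

A = np.random.rand(512, 512)
B = np.random.rand(512, 512)

# C/Fortran-style: the programmer spells out loops and indexing and is
# responsible for blocking, vectorization, parallelism, and so on.
def matmul_loops(A, B):
    n, k = A.shape
    _, m = B.shape
    C = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            s = 0.0
            for p in range(k):
                s += A[i][p] * B[p][j]
            C[i][j] = s
    return C

# High-level style: one line. A heavily tuned implementation underneath
# (BLAS here; XLA/cuDNN in ML frameworks) supplies the performance.
C = A @ B
```

The one-liner dispatches to a heavily tuned library implementation, which is exactly the division of labor Yelick credits for the performance of the machine learning packages.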
Next up was Amarasinghe, who leads the compiler research group in MIT’s Computer Science & Artificial Intelligence Laboratory (CSAIL). A leader in the field of high-performance domain-specific languages, Amarasinghe and his group have developed Halide, TACO, Simit, and many other domain-specific languages and compilers.
“If you think about domain-specific languages, [it’s] not too much of a stretch – even if you say you are a C programmer, or Fortran programmer or Python programmer – to say nobody writes loops and arrays and low-level things in these languages. We all use libraries. All the systems are based on libraries, and that means you’re already programming in a higher-level abstraction, with one caveat. These libraries don’t have an understanding of how the entire thing is connected together. So, when you call a library function, it’s a standalone thing; it will do what’s asked and return,” he said.
“What a domain-specific language or domain-specific compiler does is, it can figure out the control flow between these library calls, understand how these things get stitched together and use that to begin to optimize performance. This is especially important now and for the future, because memory systems and data movement are becoming a really important issue,” said Amarasinghe.
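As a sketch of the cross-call optimization Amarasinghe describes – a hypothetical illustration, not code from his group’s systems – consider a three-stage pipeline written as separate library calls versus the single fused pass a domain-specific compiler could generate once it sees the whole control flow:

```python
# Hypothetical illustration of cross-call fusion. The array size and
# the specific pipeline are made up for the example.
import numpy as np

x = np.random.rand(1_000_000)

# Library-call style: each call is a standalone kernel that writes its
# full result to memory, so intermediates a and b each cost a pass.
def unfused(x):
    a = np.sin(x)      # pass 1: read x, write a
    b = a * 2.0        # pass 2: read a, write b
    return b + 1.0     # pass 3: read b, write result

# What a DSL compiler can emit after seeing the whole pipeline: one
# fused pass, each element loaded once and stored once. (The Python
# loop stands in for the single compiled, vectorized kernel a system
# like Halide would generate; in pure Python it is of course slow.)
def fused(x):
    out = np.empty_like(x)
    for i in range(x.size):
        out[i] = np.sin(x[i]) * 2.0 + 1.0
    return out

assert np.allclose(unfused(x), fused(x))
```

The fused version replaces three round trips to memory with one. The FLOP count is unchanged; the saving is entirely in data movement, which is precisely the issue Amarasinghe flags.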
Perhaps the most forceful champion for focusing on data movement in future programming development was Hoefler, who directs the Scalable Parallel Computing Laboratory (SPCL) at ETH Zürich. He argued that counting FLOPS, as is done in ranking the Top500, misses the point in modern computing.
Commenting on the use of new large models such as GPT-3, he said, “Many companies are spending tens of millions of dollars to train these models, and these are real HPC problems. They are the largest models people have trained [and] very much [what] we care about. We actually analyzed the workload a little bit more in detail. We found that 99.8 percent of the floating-point operations in this workload are tensor contractions, [and] tensor contractions are all expressed as matrix multiplication.
“So, this is wonderful, isn’t it? 99.8 percent of this workload is matrix multiplication. But if you actually look at the remaining 0.2 percent of operations in this workload, [it] turns out those are taking about 40 percent of the runtime. [That’s] because these tensor contractions have been very highly optimized over the years. The problem [that] now dominates everything else is data movement. We did some optimizations – I don’t want to go into detail – that show you can actually speed this up quite significantly, and you can save millions of dollars by just looking at data movement,” said Hoefler.
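A back-of-the-envelope roofline model shows how 0.2 percent of the FLOPs can consume roughly 40 percent of the runtime. All numbers below are assumptions chosen for illustration – they are not measurements from Hoefler’s study – but the mechanism is the one he describes: matrix multiplication reuses data and runs near compute peak, while the remaining elementwise-style operations are limited by memory bandwidth:

```python
# Roofline-style estimate: why a tiny FLOP share can dominate runtime.
# All hardware and workload numbers are assumed, for illustration only.
peak_flops = 100e12   # assumed 100 TFLOP/s compute peak
mem_bw     = 1.5e12   # assumed 1.5 TB/s memory bandwidth

total_flops  = 1e18                  # assumed total training work
matmul_flops = 0.998 * total_flops   # the 99.8% in tensor contractions
other_flops  = 0.002 * total_flops   # the remaining 0.2%

# Matmul has high arithmetic intensity (data reuse), so it can run
# near the compute peak.
t_matmul = matmul_flops / peak_flops

# The remaining ops (bias, softmax, normalization, ...) move several
# bytes of memory traffic per flop, so they are bandwidth-bound.
bytes_per_flop = 5.0                 # assumed average traffic per flop
t_other = other_flops * bytes_per_flop / mem_bw

total = t_matmul + t_other
print(f"matmul: {t_matmul / total:.0%} of runtime")   # ~60%
print(f"other:  {t_other / total:.0%} of runtime")    # ~40%
```

Under these assumptions the split comes out near 60/40: shrinking the 0.2 percent of “other” work is worth almost as much as speeding up the matmuls, which is Hoefler’s point about optimizing data movement rather than counting FLOPs.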
Gottschlich, who is a principal AI scientist at Intel Labs and the director and founder of the machine programming research group at Intel, noted how Intel’s perspective on programming models has changed.
“When I joined back in 2010, Intel was very much a monolithic computing company; it was just a CPU company. As I suspect everyone in the audience knows, we now consider ourselves to be very heterogeneous,” he said. “One of the core challenges we see today is not so much in the compute, but in the data movement. So, I just wanted to quickly acknowledge that, especially as we grow into deeper stochastic systems that tend to improve their accuracy as you have more IID data (independent and identically distributed data), it becomes even more important that we figure out how to handle that data movement problem.”
“Back in 2018, we published this paper, actually jointly with Saman (Amarasinghe) and some others, on the three pillars of machine programming. Machine programming is principally this idea that we are going to try to automate the development of software, and a byproduct of that is the automation of the development of hardware, given that much of hardware is developed through software. The three pillars are intention, invention and adaptation. Intention is principally concerned with trying to identify novel ways, or improve the existing ways, for programmers to specify their ideas to the machine. So, going back to both Kathy and Saman’s comments about higher-order abstractions and DSLs: in fact, I fully agree with this. I think that as we move forward, to get outstanding performance, we really need to have this separation of intention from invention and adaptation. Once the intention is understood by the machine, then we can start to invent the algorithms and data structures that are necessary to fulfill that intention.”


Last to deliver intro remarks was Gokhale, a distinguished member of technical staff at LLNL and an expert in reconfigurable computing and data-intensive architectures.
“I feel as if we’re in a fix right now with a profusion of programming models, and it’s because of scaling laws, which we all know very well, between feature size and power. What we’ve done is build specialized widgets that do a smaller thing, but do it very well, rather than a general-purpose thing. That is a cause of a lot of problems. [It’s] one factor that is leading us to a lot of new ideas in programming models, this idea of specialization and putting heterogeneous pieces together,” said Gokhale.
“To me, the future is system-on-chip (SoC)-like environments. So, heterogeneous compute models, data- and/or control-driven, tightly or loosely coupled. [For example,] if you’ve worked for Apple or worked on cell phones, that’s the SoC environment. I have a background in reconfigurable computing with FPGAs, which is the combination of an SoC-like environment and higher-level programming. It’s a difficult environment to work in, but I see that’s where we’re going. On the other side, I see workflows for programming, [with] model interfacing and mapping. [Often] you think of your favorite DSL; it’s just so elegant and so mathematical. But it has to talk to other pieces of things, and how do you make it do that? How do you interoperate? [L]arge HPC workflows have embodied some of those ideas of being able to interface with [DSLs],” she said.
A rich discussion followed the introductory comments; as of this writing, the SC21 video remained posted and was accessible to SC21 registrants.