Originally published Sunday, February 15, 2015 at 8:01 PM

Ayasdi CEO Gurjeet Singh uses ‘big math’ to tame big data

Ayasdi’s stated goal is to make the analysis of complex data “as simple as online shopping or searching the Web.” The company’s name is the Cherokee word for “seek.”

By Pete Carey

San Jose Mercury News

Josie Lepe / Bay Area News Group

Gurjeet Singh is shown in the Menlo Park, Calif., office of Ayasdi, which was founded in 2008.

Five things about Gurjeet Singh

1. He began coding at the age of 6, with a ZX Spectrum game computer his father bought for him.

2. He builds robots in his spare time.

3. He’s a huge science-fiction fan.

4. He still writes software even though he’s a CEO.

5. He was accepted to Stanford’s Ph.D. program and came to the U.S. with only enough savings to pay for one quarter.

Source: San Jose Mercury News

Gurjeet Singh

Birthplace: Ludhiana, Punjab, India

Age: 33

Education: B.S. in instrument and control engineering, Netaji Subhas Institute of Technology, 2002; Ph.D., Institute for Computational and Mathematical Engineering, Stanford University, 2008

Work: CEO, Ayasdi, 2008-present; research scientist, Stanford University, 2008; intern, Google, 2005; software design engineer, Texas Instruments, 2002-2003.

Residence: Palo Alto, Calif.

Family: Married, one child, another on the way. Spouse Riti Sarangi is a researcher at Stanford Linear Accelerator Center.

Source: San Jose Mercury News

MENLO PARK, Calif. — Gurjeet Singh is chief executive and co-founder of a company that reflects an emerging trend in Silicon Valley — the intersection of computer science and mathematics to tackle real-world problems.

A native of India, he earned his Ph.D. in computational mathematics from Stanford University’s Institute for Computational and Mathematical Engineering, which prides itself on doing “big math.”

Building on research in data analysis by his adviser, Stanford math professor Gunnar Carlsson, Singh developed “Mapper,” a kind of software robot that automatically ferrets out interesting information from complex masses of data.

Sensing a potentially vast opportunity for use in the real world, Singh secured venture funding and, with Carlsson and Stanford co-researcher Harlan Sexton, formed Ayasdi in 2008.

The pioneering technique has been applied to problems as diverse as fraud detection and traumatic brain injury in football players.

Q: One of your co-founders is a Stanford math professor, you have a degree in computational mathematics and your other co-founder is a Stanford math grad as well. We hear about startups from Stanford’s computer science department all the time, but isn’t it unusual for three mathematicians to start a company?

A: The attitude in a pure mathematics department is very ivory-towerish, in general, so it is extremely unusual for people to start companies out of the math department. The attitude toward going to the “dark side” is very strong.

Q: You all seem to have weathered that well enough. So let’s talk about mining big data, the reason a company like yours exists. What are the burning issues in big-data analysis?

A: The term “big-data analysis” is a very weird term. It acknowledges one problem we have with data, which is that it is very big. But that’s not the main problem we have with data. The problem we have is complexity. People believe the best way to learn from the data is to have a hypothesis and then go check it, but the data is so complex that someone who is working with a data set will not know the most significant things to ask. That’s a huge problem.

Q: The solution?

A: Technologies like Ayasdi’s exist now to automatically discover information from data without having someone making guesses up front.

Q: So you just send it off, let it prowl around the data for a while and then come back and tell you what it found?

A: Correct.

Q: How do you know the findings are significant?

A: Because we do statistical tests. We run statistical validity tests on everything we find, so we are actually able to guarantee that whatever we find is present in the data.

Q: Give an example of where your company put a computer in charge of ferreting out interesting information and bringing it back for you to interpret.

A: Everything we do in this company is of that nature. One of my favorite things, one of the first things we did at Ayasdi, was a study of a very old breast-cancer data set. A decade and a half ago, the Netherlands Cancer Institute collected genetic samples from breast cancer tumors from a few hundred people. They believed that if they analyzed that data they would be able to discover types of breast cancer and they might be able to discover treatments for these types of breast cancers that they just discovered.

By and large it came true. But there was a population with a certain type of breast cancer that clinicians saw in the field, but that researchers could never pull out. They spent a decade and a half trying to do it. We threw it into our software and within minutes we were able to discover it without looking for it. It just popped out. Think about it. A few minutes, versus a decade and a half. We published it in Nature. There are so many examples like that.

Q: Another example?

A: Mount Sinai Hospital in New York is one of our customers. They collected roughly 20,000 genetic samples from people with Type 2 diabetes along with their clinical histories. They wanted to figure out, is Type 2 diabetes a disease or is it a symptom?

Are there underlying molecular diseases that display the same outward symptoms but are actually distinct, and thus require very different treatment regimes? In fact, using our Topological Data Analysis system, they were able to discover multiple types of Type 2 diabetes. That obviously has a huge impact on all the hundreds of millions of people who have Type 2 diabetes in the world because they don’t actually have Type 2 diabetes. They have Type 2 diabetes, type 1, or Type 2 diabetes, type 6.

Q: What’s good about that?

A: If you know people with Type 2 diabetes, there’s a high likelihood they will have different medication regimes and different lifestyle options.

Q: In ceding the search to computers, aren’t you losing control of your research?

A: The point is not to cede control, go home and have the computer do your work for you. The point is you will be able to do much more work and you will be much more productive than you have ever been in the past. There is a vast under-appreciation of what machines and algorithms are capable of today. We certainly have the means to change the fundamental way we do things in society. That’s the stuff I’m most excited about.