When Niketan Pansare (pronounced “nick-EE-tan pan-sar-EE”) arrived in the United States, it was to work on his master’s at the University of Florida. There, he took two extremely challenging and highly recommended UF courses, Indexing Large Databases and Database Implementation, both taught by Professor Chris Jermaine. Near the end of the Database Implementation course, Jermaine called Pansare into his office and asked if he’d be interested in working on a PhD. Pansare said, “I had just gotten a dream job offer from Microsoft, and not even thought of PhD until then. He gave me a week to think about it.”
Over the next week, Pansare spoke to many UF students pursuing PhDs under Jermaine’s supervision and learned that he was one of the best advisers to work with in the area of database research. Jermaine primarily works on problems at the intersection of applied statistics and systems-building. “Collaborating with Chris as my adviser was a proposal I could not resist, and I decided to go for it,” Pansare said. But before Pansare applied to the PhD program in Florida, Jermaine was offered a professorship at Rice. At that time, Jermaine suggested that Pansare consider applying for the PhD program at Rice, but offered to write him a recommendation letter if he chose to go to a different university.
“I did a little more research, and attended Rice’s prospective graduate student orientation,” Pansare said. His interactions with the Rice professors so impressed Pansare that he did not apply to any other PhD program. “Chris and my orientation trip sold me on Rice,” he said. But he was also influenced by the high availability of the faculty. He said Rice’s low student-teacher ratio doesn’t get a lot of acknowledgement, but his recent experience at a big public school had reinforced the importance of easy access to faculty advisers. For Pansare, the low student-teacher ratio meant plenty of individual time with Jermaine. “Chris would give me something to read and then we’d meet to discuss it every day or every other day. It was like getting an additional, mini-course on machine learning and statistics in a one-on-one setting.”
Pansare also absorbed important lessons from other Rice professors. He said, “I had just arrived at Rice and took a class on Distributed Systems with Prof Alan Cox. In that class, in addition to the technical aspect of the material, Prof Cox worked closely with the students to improve their presentation skills. One of his key pieces of advice that I try to incorporate in my presentations is to ‘hook your audience with quick motivation’ rather than recite the outline.”
He learned the importance of product validation and rapid prototyping in a Rice MBA course. Pansare said, “Validating early and often is as important in research as in startups; it is probably one of the most important lessons of my grad school experience and I learned it the hard way.”
Connecting CS problems with ideas from other disciplines resulted in the publication of his second paper, one of the high points of his Rice program. Jermaine helped him identify and study a new research direction, exploring the application of reward theory, popular in a small branch of statistics, to improve the processing time for large MapReduce jobs using online aggregation. To explain the challenge, Pansare cited a typical problem: finding the average salary for a group of employees. “If you are searching a really large database for an average salary, it might take a couple of days on a large computer,” he said. “Let’s say the average is $80K. After submitting his job, the user just waits for the query to process over that time. But what if you could make an educated guess as soon as you start getting data?”
For example, Pansare said, the user might be able to make an early forecast that the answer would fall in a range of $70–90K, with 90% probability of accuracy. As more data flows in, the forecast could narrow its range, perhaps to $79–81K with 95% probability. The benefits of a “good enough” answer are twofold: the user gets an acceptable answer more quickly and saves the expense of continuing to run the query. “If you are running your MapReduce jobs in Cloud, stopping early saves you money,” said Pansare.
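The idea of a forecast that tightens as data arrives can be illustrated with a small sketch. This is not the method from Pansare's paper; it is a textbook running confidence interval, and it assumes records arrive in random order so that the normal approximation is reasonable. The function names, the synthetic salary data, and the $1,000 tolerance are all hypothetical choices made for illustration.

```python
import math
import random
from statistics import NormalDist


def online_average(stream, confidence=0.95, tolerance=1000.0, min_samples=30):
    """Yield (n, mean, half_width) per record and stop once the
    confidence interval's half-width is at most `tolerance`.

    Illustrative only: assumes records arrive in random order.
    """
    # z-score for a two-sided interval at the requested confidence level
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations (Welford's algorithm)
    for x in stream:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
        if n < min_samples:
            continue  # too few records for a meaningful interval
        std_err = math.sqrt(m2 / (n - 1) / n)
        half_width = z * std_err
        yield n, mean, half_width
        if half_width <= tolerance:
            return  # "good enough": stop reading the rest of the data


def demo(seed=0, population=1_000_000):
    """Run the estimator over synthetic salaries (hypothetical data)."""
    rng = random.Random(seed)
    salaries = (rng.gauss(80_000, 20_000) for _ in range(population))
    estimate = None
    for estimate in online_average(salaries):
        pass  # a real system would report each intermediate estimate
    return estimate


if __name__ == "__main__":
    n, mean, half_width = demo()
    print(f"stopped after {n} records: {mean:,.0f} +/- {half_width:,.0f}")
```

In this toy setting the loop stops after reading only a small fraction of the million synthetic records, which is the cost saving Pansare describes. A real MapReduce job is harder because data does not arrive in random order, which is part of what makes the research problem difficult.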
He remains excited about solving problems at IBM Research-Almaden in San Jose, CA. He said, “[Almaden] is the premier database research group, one of the few remaining labs, and it was the birthplace of database research. The research papers I looked up as a grad student were written by the people I now work with, and the main reason I applied to Almaden.”
In addition to his association with the Almaden Research Center, Pansare still collaborates with his adviser at Rice as well as Jermaine’s current PhD students, like Jacob Gao. Pansare has also accepted an invitation to give a talk to CS students on SystemML, machine learning software developed by researchers at his lab, and he will continue fostering collaborations between IBM Almaden Research Center and Rice University’s CS PhD graduates.