Data Scientist

Location: Chennai and Bangalore

Experience: 2 to 7 years

Key Responsibilities:

Develop “core” data science models and capabilities that power the Near Ambient Intelligence Platform and associated products.

Perform advanced data analytics, including processing structured data (payments, telecom, page clicks, etc.) and unstructured data in multiple formats (text, audio, video), spanning multiple domains including user profile data, geospatial data, network data and retail data.

Partner with the technology and business teams to build a superior data quality pipeline that will feed the models.

Research and create intellectual property for the company that will benefit Near and its partners.

Use nonparametric and probabilistic models to generate insights, keeping in mind the bias-variance trade-off.

Work closely with the Engineering team to “operationalize” and deploy the models.

Mentor and share data science knowledge with other members of Near's global team; document your work and partner with others as a team to deliver maximum value for the company.

Understand and prioritize data science work based on cost-effectiveness, leveraging time management skills.

Attend conferences and organize workshops/meet-ups to stay in touch with the data science community.

Skills/Experience:

Must have 2 to 7 years of industry experience developing data science models.

Must have completed academic projects in data science, experimenting with raw data and generating insights; publications are a plus.

Must have thorough mathematical knowledge of correlation/causation, decision trees, classification and regression models, recommenders, probability and stochastic processes, distributions, priors and posteriors.

Skilled in scientific programming languages such as Python, Java, R, MATLAB and Clojure, and in writing code that can be deployed to production.

Understand the model lifecycle: cleansing/standardizing raw data, feature creation/selection, writing complex transformation logic to generate independent and dependent variables, model selection, tuning, A/B testing and generating production-ready code.

Knowledge of numerical optimization, linear/non-linear/integer programming, statistics and combinatorial optimization is a plus.

Familiarity with R, Apache Spark (Java, Scala, Python), PyMC3/Theano/TensorFlow and other scientific Python/R modules is a plus.

Must be comfortable writing code for model building, and able to bootstrap, test and own models through their lifecycle, including DevOps and deployment to the cloud.

Job Requirements:

We are looking for a data scientist with a Master’s degree; a PhD is preferred.

The ideal candidate has academic experience and has published a few research papers.

Overall 2 to 7 years of experience, with at least 2 years of working experience at a data-driven company or on a data-driven platform.

The candidate is expected to have exceptional problem-solving, analytical and organisational skills, with a detail-oriented attitude.

Passion for learning new technologies and staying up to date with the scientific research community.