Principal Data Scientist – £100,000 – £120,000 + London – Java / Scala, Matlab, R, SQL, Hadoop, Pig
My client is a Big Data consultancy start-up seeking a Senior Data Scientist to join their company.
The Principal Data Scientist will:
- Design hypothesis tests, oversee test execution, and evaluate the results
- Model, predict and classify data
- Utilise machine learning and large-scale data mining techniques to discover and identify actionable patterns in the data
- Help define and document business requirements and acceptance criteria
- Assist in or lead workshops and document relevant outcomes
- Identify opportunities and appropriate solutions (e.g. algorithms and libraries)
- Present to both technical and non-technical stakeholders, internally and in a Customer
- Agile cross-functional teamwork:
- Contribute to sprint planning, provide realistic estimates and plan deliverables
- Attend stand-ups and retrospectives
- Research, design, evaluate, build, tune and document end-to-end data science solutions
- Understand and solve scalability and production issues
- Documentation and coding standards:
- Adhere to coding standards and best practices at BDP
- Ensure all models are validated and all business logic is robustly tested
Skills/Experience:
- Core programming, text file manipulation, and statistics with Numpy, Pandas, Scikit or other approved modules
- Data frames, data manipulation, and objects
- Command line, pipes, and remote terminals
- Push and pull versions and code branches from approved version control system at
- Writing high-performance code in Java, Scala, C/C++, Fortran to be called by Python or R
- Strong in additional statistics and scientific tools such as SAS, SPSS, Matlab, etc.
- Loading & parsing data in Spark. Use SQL context in Spark. GraphX proficient. Develop models leveraging Spark (ML or MLlib)
- Exporting, importing, aggregating, and filtering data in one of the relational stores: SQL, Hive, Pig, or approved other technology.
- Cleaning, manipulating, and formatting data stored in all of these non-relational stores: flat files and RESTful APIs.
- Writing jobs to read, filter, manipulate, and aggregate data stored in Hadoop with one of the APIs: Spark, Java MR, Hadoop Streaming w/ Python, or approved other API.
- Writing UDFs for Hive, Pig, or approved other technology using Java, Python, or approved other language.
- Storing, extracting, and querying objects from one of these key-value oriented data stores: Cassandra, Mongo, or approved other technology.
- Generating data profiles including measures of central tendency, measures of deviation
This is a great opportunity for someone to join a really exciting company.
Please contact firstname.lastname@example.org for more information or feel free to call me on 0207 928 2525
Job Reference: Data-scientistLondo
Salary: £100000 - £120000 per annum + Benefits
Job Start Date: ASAP - 1 Month