Senior Backend Engineer at Scrapinghub

java, python, scala
Published a month ago

Scrapinghub is looking for a Senior Backend Engineer to develop and grow a new web crawling and extraction SaaS.


The new SaaS will include our recently released AutoExtract, which provides an API for automated e-commerce and article extraction from web pages using Machine Learning. AutoExtract is a distributed application written in Java, Scala and Python; its components communicate via Apache Kafka and HTTP and are orchestrated using Kubernetes.


You will design and implement distributed systems: building a large-scale web crawling platform, integrating Deep Learning based web data extraction components, working on queue algorithms and large datasets, creating a development platform for other company departments, and more. This is going to be a challenging journey for any backend engineer!


As a Senior Backend Engineer, you will have a large impact on the system we’re building, because the new SaaS is still in the early stages of development.



Job Responsibilities:



  • Work on the core platform: develop and troubleshoot a Kafka-based distributed application, and write and change components implemented in Java, Scala and Python.

  • Work on new features, including design and implementation. You should be able to own and be responsible for the complete lifecycle of your features and code.

  • Solve distributed systems problems, such as scalability, transparency, failure handling, security, multi-tenancy.





Requirements




  • 3+ years of experience building large-scale data processing systems or high-load services.

  • Strong background in algorithms and data structures.

  • Strong track record in at least two of these technologies: Java, Scala, Python, C++. 3+ years of experience with at least one of them.

  • Experience working with Linux and Docker.



  • Good communication skills in English.

  • Computer Science or other engineering degree.



Bonus points for:



  • Kubernetes experience

  • Apache Kafka experience

  • Experience building event-driven architectures

  • Understanding of web browser internals

  • Good knowledge of at least one RDBMS.

  • Knowledge of today’s cloud provider offerings: GCP, AWS, etc.

  • Web data extraction experience: web crawling, web scraping.

  • Experience with web data processing tasks: finding similar items, mining data streams, link analysis, etc.

  • History of open source contributions

