Website crawler and natural language processor

$1500-3000 USD

Closed
Posted 11 months ago

Paid on delivery
Hi all,

We're seeking a skilled technical vendor or developer/data scientist (person or team) with skills in web technology, natural language processing and machine learning to create a website crawler that can accomplish a specific set of tasks. Below, you'll find the specifications for the project. Please review them and provide us with a quote for your services, along with the technology and services you suggest.

Project Requirements

1. Website Crawler: We need a website crawler that can crawl an entire website based on an inputted domain name. This crawler should be able to navigate and extract information from all the pages on the website, including both HTML pages and PDFs.
2. Word Identification: The system should be able to identify specific words used on the website. We will provide a list of approximately 100 words. The system should be able to generate a list of URLs where one or more of these words are used. Each URL should be given a score based on the frequency and variety of these words on the page.
3. Contextual Information: Every time a word from the list is found, the system should display the word in context. This means displaying a text snippet with 100 characters before and after the word.
4. Data Analysis: The system should provide an overall analysis of how many times words from our list are used across the entire website. Additionally, it should display what percentage of the total words on the website come from our list.
5. Data Storage and Comparison: All results must be saved per website. However, the full crawled content does not need to be saved, just processed. The system should allow us to re-run the crawl and compare the results with those of the previous crawl.
6. Machine Learning Application: Optimally, the system should utilize machine learning. This will allow an admin user to categorize a found word and its context as either "problematic" or "unproblematic". Over time, the system should improve its ability to score these word-context pairs, rating them on a scale from 1-10 based on their "likeliness to be problematic".

Deliverables

Initially, we're looking for a technical "proof of concept". It's acceptable for the initial version to have some workarounds or minor limitations. We are looking for a solution that utilizes existing frameworks or services in order to save time and increase quality. These could be services such as OpenAI, Huggingface or other relevant services.

If the proof of concept is successful, we would like the option to further develop the tool into a more polished product. Please include your estimate for the proof of concept as well as any thoughts you have on potential future development in your quote.

Looking forward to hearing from you.

Skills and experience required:
- Experience in web crawling and data scraping
- Knowledge of natural language processing and text analysis
- Proficiency in programming languages such as Python and Java, as well as HTML, CSS and JS

Best Regards,
Henry
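Requirements 2 to 4 of the brief (word identification, context snippets, and the share of listed words) can be illustrated with a minimal Python sketch that uses only the standard library. It assumes the page text has already been extracted from HTML or PDF and that the word list holds single words. The function name `analyze_page` and the scoring rule (frequency multiplied by variety) are assumptions made for illustration; the brief does not say how the score should be computed.

```python
import re
from collections import Counter


def analyze_page(text, watch_words, context=100):
    """Find listed words in `text`; return snippets, a score, and a percentage."""
    if not watch_words:
        return {"hits": [], "counts": Counter(), "score": 0, "percent_listed": 0.0}

    # One case-insensitive pattern with word boundaries, longest words first.
    ordered = sorted(watch_words, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(w) for w in ordered) + r")\b",
        re.IGNORECASE,
    )

    hits = []
    counts = Counter()
    for match in pattern.finditer(text):
        # Snippet: up to `context` characters before and after the match.
        start = max(0, match.start() - context)
        end = min(len(text), match.end() + context)
        word = match.group(0).lower()
        hits.append({"word": word, "snippet": text[start:end]})
        counts[word] += 1

    total_words = len(re.findall(r"\w+", text))
    frequency = sum(counts.values())   # how often listed words occur
    variety = len(counts)              # how many distinct listed words occur

    # Hypothetical scoring rule, not taken from the brief.
    score = frequency * variety
    percent = 100.0 * frequency / total_words if total_words else 0.0
    return {"hits": hits, "counts": counts, "score": score, "percent_listed": percent}
```

Calling `analyze_page(page_text, ["miracle", "guaranteed"])` once per crawled URL and summing the counts would give the site-wide totals described in requirement 4.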
Project ID: 36582014

About the project

63 proposals
Remote project
Active 9 mos ago

63 freelancers are bidding on average $2,519 USD for this job
Hi Henry. Experienced Web/ML/AI/OpenAI/Huggingface developer here with 5+ years of extensive hands-on experience in full-stack development. I am happy to mention that I rank among the top 2% of 58+ million freelancers. I would like to ask the following questions about your project.
Questions:
1. "extract information from all the pages on the website". Can you please confirm this information?
2. "identify specific words used". What will these specific words be?
3. What sort of administrative features do you require?
4. Will this product be SaaS-based?
5. Can you provide me with the complete scope of this project?
Happy to discuss all the functional and non-functional requirements over chat. I would also like to share my relevant portfolio and experience. Kindly initiate the chat so we can discuss each aspect of this project thoroughly; it would be my pleasure to help you in any way.
Regards, Nayab
$3,000 USD in 60 days
5.0 (69 reviews)
8.4
Hey, good morning. I just finished reading the brief. I see you are looking for someone with experience in these tech stacks: Web Scraping, Machine Learning (ML), JavaScript, Data Processing and Python. Please review my profile, skills, projects and customer feedback to confirm that I will be a good fit for this job. I would like you to start the chat so we can discuss the project in detail and see how it goes.
Questions:
1. Are these all the requirements? If not, please share more detailed requirements.
2. Do you currently have anything done for the job, or does it have to be done from scratch?
3. What is the timeline to get this done?
Why Choose Me?
1. I have done more than 250 major projects on freelancer.com alone.
2. I have not received a single piece of bad feedback in the last 5-6 years.
3. You will find 5-star feedback on my last 100+ major projects, which shows my clients are happy with my work.
Portfolio: https://www.freelancer.com/u/AwaisChaudhry
Timings: 9am - 9pm Eastern Time (I work as a full-time freelancer)
Thanks and regards, Awais
$3,000 USD in 21 days
4.9 (81 reviews)
8.3
Hi Henry, My name is George and I'm an experienced web crawler expert. I believe I'm the perfect fit for this job. I have the necessary skills and experience in both web crawling and data scraping, as well as strong knowledge of natural language processing and text analysis. I'm also proficient in various programming languages like Python, Java, HTML, CSS and JS. In addition, I already have a custom web scraping tool which I've used on a similar project before. To provide you with the best possible results, I suggest using existing frameworks and services such as OpenAI or HuggingFace. This will both save time and increase quality. To help ensure a successful proof of concept, I'd like to ask the following questions:
1. What language would you prefer for the project's code?
2. Are there any specific restrictions for this project?
3. What kind of data do you want to collect from the website?
4. Does the system need to be able to compare different versions of the same site?
5. What type of result are you expecting for the machine learning application?
I'm enthusiastic about working on this project and am ready to provide all the skills and knowledge needed to ensure its success. Best regards, George
$3,000 USD in 3 days
5.0 (157 reviews)
8.0
Hello Sir, I read the description and understood your project details. You are looking for a website crawler and natural language processor. I have 8+ years of expertise in WordPress, PHP/MySQL, HTML5, Responsive Design, Ajax, jQuery, and CSS. I have developed many websites using WordPress, such as e-commerce, real estate, blog, jobs portal, hotel and flight booking sites, and many others. I would appreciate it if you considered me for this job opportunity so we can discuss the task in detail. Please ping me any time. Thanks
$2,250 USD in 7 days
4.9 (162 reviews)
7.6
Hello! My name is Umar and I'm a qualified freelancer with 10 years of experience in IT, specifically in Web Scraping. I understand that you're seeking a skilled technical vendor to create a website crawler that's able to accomplish a specific set of tasks. We believe we are the perfect fit for this project because of our extensive experience in web technology and natural language processing. Additionally, we're able to provide data analysis of how many times words from your list are used across the entire website, as well as what percentage of the total words on the website come from your list. I am looking forward to working with you. Thank you.
$2,000 USD in 4 days
4.9 (269 reviews)
7.3
Dear Sir, I am excited to apply for the technical vendor, developer/Data scientist role for your website crawler project. I have experience in web crawling, data scraping, and text analysis and am proficient in programming languages such as Python, Java, HTML, CSS, and JS. I am confident in my ability to deliver a high-quality solution that meets your specifications. I will create a website crawler that is able to crawl an entire website based on an inputted domain name, extract information from all the pages on the website, and identify specific words used on the website. The system will generate a list of URLs where one or more of these words are used and provide an analysis of how many times words from your list are used across the entire website. For the initial version, I will provide a technical proof of concept that may have minor limitations or workarounds. I will utilize existing frameworks and services such as OpenAI and Huggingface to save time and increase the quality of the final product. Please let me know if you have any questions or concerns regarding my application or my approach to the project. I am looking forward to hearing from you. Thank you for your time and consideration. Sincerely, Smith
$2,250 USD in 7 days
5.0 (52 reviews)
6.8
Hi Henry, Based on the project requirements, I would suggest the following technologies and services:
- Python for web crawling and data scraping
- Natural Language Processing (NLP) libraries such as NLTK or spaCy for word identification and contextual information
- Machine Learning libraries such as scikit-learn for developing a model that rates word-context pairs
- AWS or Google Cloud Platform for cloud hosting and storage
- Elasticsearch or a similar search engine for indexing and searching the crawled content
- Django or Flask for developing a web application to display the results and provide an interface for categorizing word-context pairs
For the proof of concept, I estimate that the project could take anywhere from 2-4 weeks, depending on the complexity of the website being crawled and the quality of the input data. The cost would depend on the hourly rate of the developer or team working on the project, which varies depending on location and experience level. For potential future development, I would suggest further developing the machine learning model to improve its accuracy in rating word-context pairs. Additionally, it may be beneficial to add functionality to allow for more advanced search queries and filtering options. Best regards, Esha Nawal
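The idea of a model that rates word-context pairs can be sketched in plain Python. Note that this is not scikit-learn: it is a tiny hand-written naive Bayes stand-in so the example runs without dependencies. The class name `SnippetRater`, the example snippets, and the mapping from probability to a 1-10 rating are all hypothetical choices, not something specified in the project brief.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; a deliberately simple tokenizer."""
    return re.findall(r"[a-z']+", text.lower())


class SnippetRater:
    """Naive Bayes over snippet tokens, trained from admin labels."""

    def __init__(self):
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = {True: 0, False: 0}

    def add_label(self, snippet, problematic):
        # Called whenever an admin marks a snippet problematic or not.
        self.doc_counts[problematic] += 1
        self.word_counts[problematic].update(tokenize(snippet))

    def probability(self, snippet):
        """Estimated probability that the snippet is problematic."""
        vocab = set(self.word_counts[True]) | set(self.word_counts[False])
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for label in (True, False):
            # Laplace-smoothed prior and per-token likelihoods.
            log_p = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            denom = sum(self.word_counts[label].values()) + len(vocab) + 1
            for token in tokenize(snippet):
                log_p += math.log((self.word_counts[label][token] + 1) / denom)
            log_scores[label] = log_p
        # Normalize in a numerically stable way.
        top = max(log_scores.values())
        pos = math.exp(log_scores[True] - top)
        neg = math.exp(log_scores[False] - top)
        return pos / (pos + neg)

    def rating(self, snippet):
        # Assumed mapping: probability 0..1 onto the 1-10 scale.
        return 1 + round(9 * self.probability(snippet))


rater = SnippetRater()
rater.add_label("cheap miracle cure guaranteed", True)
rater.add_label("the council guaranteed funding for schools", False)
print(rater.rating("miracle cure"), rater.rating("funding for schools"))
```

With more labelled snippets the estimates should shift accordingly, which is the "improve over time" behaviour the brief asks for; a production version would more likely use an established library or hosted model.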
$2,200 USD in 10 days
4.9 (43 reviews)
7.1
Hi, I hope you are doing fine. I have almost 10 years of experience in machine learning algorithms. I can implement various types of artificial intelligence algorithms, including yours, with Matlab, Python, etc. I have a PhD from Tohoku University and several journal publications on these subjects. You can see my portfolio for my previous projects. I read about your project and am interested in working with you. Please send me a message so that we can discuss further. Best regards.
$2,250 USD in 7 days
5.0 (41 reviews)
6.7
Dear Henry, Thank you for considering our services for the development of a website crawler and natural language processor. Our proposed solution aims to meet your project requirements effectively.
Approach and Methodology:
a. Website Crawler: We will develop a web crawler capable of exploring entire websites based on provided domain names.
b. Word Identification: Using natural language processing techniques, our solution will identify specific words from a provided list.
c. Contextual Information: To provide context, our system will display the identified words in text snippets consisting of 100 characters before and after each occurrence.
d. Data Analysis: Our solution will perform an in-depth analysis of word usage across the entire website.
e. Data Storage and Comparison: Although the full crawled content will not be saved, the system will enable rerunning the crawl and comparing the results with previous iterations.
f. Machine Learning Application: This will enable an admin user to categorize word-context pairs as "problematic" or "unproblematic".
Technology Stack:
- Web crawling and data scraping: Python (Scrapy, BeautifulSoup)
- Natural language processing and text analysis: Python (NLTK, spaCy)
- Machine learning: Python (scikit-learn, TensorFlow)
- Data storage: Relational or NoSQL database (MySQL, MongoDB)
Let's connect on chat or have a quick call to discuss this in further detail. Looking forward to hearing from you soon. Regards, Manoj
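The "rerun and compare" step mentioned in this proposal can be sketched as a diff between two stored result sets. This is a minimal illustration, assuming each crawl is saved as a mapping from URL to page score; the function name `compare_crawls` and that storage shape are hypothetical, not part of the proposal.

```python
def compare_crawls(previous, current):
    """Diff two {url: score} mappings from consecutive crawls of one site."""
    prev_urls, curr_urls = set(previous), set(current)
    # URLs present in both crawls whose score moved: (old score, new score).
    changed = {
        url: (previous[url], current[url])
        for url in prev_urls & curr_urls
        if previous[url] != current[url]
    }
    return {
        "added": sorted(curr_urls - prev_urls),      # newly flagged URLs
        "removed": sorted(prev_urls - curr_urls),    # no longer flagged
        "changed": changed,
    }


report = compare_crawls({"/a": 3, "/b": 1}, {"/a": 5, "/c": 2})
print(report)
```

Because only the per-URL results are compared, this fits the brief's note that the full crawled content does not need to be kept.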
$3,000 USD in 40 days
5.0 (7 reviews)
6.8
Hi, I saw that you need help with Web Scraping, Machine Learning (ML), Data Processing, Python and JavaScript. I have 6 years of experience working with these technologies and believe I can help you. Please have a look at my portfolio and the customer feedback on my profile. If you find me suitable for the job, please start the chat and let's discuss it. Regards, Shamshad
$3,000 USD in 15 days
5.0 (40 reviews)
6.5
$2,250 USD in 7 days
5.0 (45 reviews)
6.5
Hi, Henry. Thank you for your detailed job posting. As a senior developer with over 10 years of experience, I look forward to producing the outcome you require. I am familiar with Python, HTML/CSS/SCSS, JavaScript, and MySQL/PostgreSQL/MongoDB. I think it would be great to use Python/Django/Selenium and AI to scrape the data and process it. I will have no trouble integrating the data processing and analytics. Please contact me. Best Regards, Imad
$2,750 USD in 21 days
4.9 (16 reviews)
6.1
Hi Henry, I am sure that I can do this job. I am an artificial intelligence engineer experienced in software architecture. I have completed many projects like yours; you can find samples in my portfolio. I will attend meetings with you to explain everything about the code and methodologies. I am ready to start with you now. Feel free to contact me for further details; I am looking forward to working with you. Thanks
$2,250 USD in 7 days
4.9 (32 reviews)
5.7
Hi there, My name is Umair. I have good experience with Web Scraping, JavaScript, Data Processing, Machine Learning (ML) and Python. I am a practicing Developer/Designer Since 2015. I can perfectly work on this project regarding Web Scraping, JavaScript, Data Processing, Machine Learning (ML) and Python. Based on my experience, I can do this task for you and the quality of work would be up to the mark. However, further discussions are required for more clarity. I will wait for your text to discuss the project in further detail. Thanks & Regards Umair A.
$2,700 USD in 20 days
5.0 (9 reviews)
5.4
Hello HenryGKDK, We would like to take this opportunity and will work until you are 100% satisfied with our work. We are an expert team with many years of experience in Web Scraping. Let's connect in chat so that we can discuss further. Thank you
$3,000 USD in 7 days
5.0 (9 reviews)
5.4
Nice to meet you HenryGKDK, My name is Anthony Muñoz, I express my interest in working on your project after carefully reading the requirements and concluding that they match my area of knowledge and skills. I am currently the lead engineer for the IT agency DSPro and I have more than 10 years of experience in the field. I have successfully completed a large number of similar jobs and I consider your project to be a challenge in which I would like to work and be able to make it a reality. Please feel free to contact me, it will be my pleasure to help you. I greatly appreciate the time provided and I remain attentive to any questions or concerns. Greetings
$2,439 USD in 7 days
4.7 (6 reviews)
5.8
Hello there! My name is Bharat lal and I'm the founder of Object Square, a blockchain development company. I understand that you're seeking a skilled technical vendor to create a website crawler that's able to accomplish a specific set of tasks. We think we are the perfect fit for this project because of our extensive experience in web crawling and data scraping as well as natural language processing and text analysis. Additionally, we have the expertise required to develop a website crawler that utilizes existing frameworks to reduce time and increase quality. We believe that our combination of skills, experience and commitment make us an excellent choice for this job. As part of our commitment to providing top quality service at all times, we would offer comprehensive quality checks & testing during the development process as well as documentation and suggestions for improvements so that your website crawler can be optimized for performance without compromising on user experience. In addition we offer support and maintenance with quick turn around time so that your needs are always being taken care of.
$2,850 USD in 20 days
5.0 (6 reviews)
5.0
Hi, I can help you, as I have done several similar jobs related to Python, JavaScript, Data Processing, Web Scraping and Machine Learning (ML). I have read the details and would like to discuss them further; please message me so we can go through them in detail. Regards
$3,000 USD in 8 days
5.0 (3 reviews)
4.9
Greetings, I read through the job details extremely carefully and I am absolutely sure that I can do the project very well. Given my work experience in JavaScript, Python, Data Processing, Web Scraping, Machine Learning (ML) development, I assure you that this task is very much within my domain! This is a tentative bid for now as I would require some more detail to evaluate the budget and time frame more accurately. I am available 24/7 so you can contact me anytime round the clock to discuss the project. Best Regards, Danny
$2,900 USD in 7 days
5.0 (5 reviews)
4.9
Hello there! My name is Vipin and I'm the founder of Codemeg Soft Solutions. We specialize in web development, mobile app development, and data analysis. We understand your need for a website crawler that can crawl an entire website based on an inputted domain name. This crawler should be able to navigate and extract information from all the pages on the website, including both HTML pages and PDFs. Our team has extensive experience in natural language processing and text analysis which will be necessary to complete your project requirements. Additionally, our team is proficient in programming languages such as Python and Java which will enable us to build a robust website crawler solution. We're confident that our combination of skills, experience, and commitment make us the perfect choice for this project. Please feel free to reach out to us if you have any questions or would like more information about our services or portfolio.
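The crawling step this proposal restates (follow every page under one inputted domain, including links to PDFs) depends on extracting and filtering links from each fetched page. The following is a minimal standard-library sketch of that one step, with no network access; the names `LinkCollector` and `same_site_links` are hypothetical, and fetching pages or parsing PDF content is assumed to happen elsewhere.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect raw href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def same_site_links(page_url, html, domain):
    """Return absolute http(s) links on `domain` found in `html`, deduplicated."""
    parser = LinkCollector()
    parser.feed(html)
    links = set()
    for href in parser.hrefs:
        # Resolve relative links and drop #fragments so pages are not revisited.
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        parsed = urlparse(absolute)
        if parsed.scheme in ("http", "https") and parsed.hostname == domain:
            links.add(absolute)
    return sorted(links)


sample = '<a href="/about">About</a> <a href="report.pdf#page=2">Report</a>'
print(same_site_links("https://example.com/docs/", sample, "example.com"))
```

A crawler would call this on each fetched page and push unseen URLs onto a queue; whether subdomains count as part of the site is a scope decision the brief leaves open, and this sketch matches the hostname exactly.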
$1,500 USD in 15 days
5.0 (5 reviews)
5.0

About the client

Flag of DENMARK
Copenhagen, Denmark
0.0
0
Member since May 15, 2023
