g1879 / DrissionPage
- Monday, July 22, 2024, 00:00:01
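A Python-based web automation tool. It can control the browser as well as send and receive data packets, combining the convenience of browser automation with the efficiency of requests. It is powerful, with countless user-friendly designs and convenient features built in. Its syntax is concise and elegant, and it needs little code.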
How to use: Documents
This project is mainly updated on Gitee and is pushed to GitHub once a stable version has been produced. Check Gitee for the latest developments.
DrissionPage is a Python-based web page automation tool. It can control the browser, send and receive data packets, and combine the two into one. It offers both the convenience of browser automation and the high efficiency of requests. It is powerful and has countless built-in user-friendly designs and convenient functions. Its syntax is concise and elegant, it requires little code, and it is friendly to novices.
Your star is the greatest support for me.💖
CapSolver is an AI-powered service that specializes in solving various types of captchas automatically. It supports data collection by helping developers overcome the captcha challenges encountered during web scraping. It supports captchas such as reCAPTCHA V2, reCAPTCHA V3, hCaptcha, FunCaptcha, DataDome, AWS Captcha, Geetest, and Cloudflare Turnstile, among others. For developers, CapSolver offers API integration options detailed in its documentation, which simplify adding captcha solving to applications. It also provides browser extensions for Chrome and Firefox, making the service easy to use directly within a browser. Different pricing packages are available to accommodate varying needs.
Watch ads that support open source authors, thx.
If this project is helpful to you, why not buy the author a cup of coffee :)
When collecting data with requests, a website that requires login forces you to analyze data packets and JS source code, construct complex requests, and often deal with anti-crawling measures such as verification codes, JS obfuscation, and signature parameters. The barrier to entry is high and development efficiency is low. Using a browser largely sidesteps these pitfalls, but a browser is not very efficient at runtime.
Therefore, this library was designed to combine the two and achieve "fast to write" and "fast to run" at the same time. It can switch to the appropriate mode as needs change, and it offers a user-friendly way of working that improves both development and runtime efficiency. Beyond merging the two, the library also wraps commonly used functions at the level of the web page, providing very simple operations and statements so that users can worry less about details and focus on implementing features. Implement powerful functions in a simple way and make your code more elegant.
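As a rough illustration of the mode switching described above, here is a minimal sketch. It assumes a page object that behaves like DrissionPage's `WebPage`, with `get()`, `ele()`, `change_mode()` and `html`; check the project documentation for the exact behavior. The function name, URLs, and element locators (`#username`, `#password`, `#submit`) are hypothetical placeholders, not taken from the project.

```python
def login_then_fetch(page, login_url, data_url, user, password):
    """Log in through the browser, then fetch data in the faster requests-style mode.

    `page` is assumed to behave like DrissionPage's WebPage.
    All locators below are hypothetical and depend on the target site.
    """
    # Browser mode: let the real page handle JS, redirects and cookies.
    page.get(login_url)
    page.ele('#username').input(user)
    page.ele('#password').input(password)
    page.ele('#submit').click()

    # Switch modes; the login state is assumed to carry over with the switch.
    page.change_mode()

    # Requests-style mode: fetch the data page without rendering it.
    page.get(data_url)
    return page.html


# Possible usage, assuming DrissionPage is installed:
#   from DrissionPage import WebPage
#   html = login_then_fetch(WebPage(), 'https://example.com/login',
#                           'https://example.com/data', 'user', 'secret')
```

Passing the page object in, rather than creating it inside the function, keeps the sketch independent of any particular browser setup and makes it easy to try against a stand-in object.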
Earlier versions were implemented by repackaging selenium. Starting from 3.0, the author started from scratch and redeveloped the underlying layer, removing the dependence on selenium, enhancing functionality, and improving runtime efficiency.
Simple yet powerful!
After long-term practice, the author has run into countless pitfalls, and all the lessons learned have been written into this library.
This library uses a fully self-developed kernel, has many practical functions built in, and integrates and optimizes common functions. Compared with selenium, it has the following advantages:
- Handles `<iframe>` without switching in and out
- Treats `<iframe>` as a normal element: after obtaining it, you can search for elements inside it directly, which makes the logic clearer
In addition to the above advantages, this library also has numerous built-in user-friendly designs.
Please do not apply DrissionPage to any work that may violate laws, regulations, or ethical constraints. Please use DrissionPage in a friendly manner, comply with the robots protocol, and do not use DrissionPage for any illegal purpose. By choosing to use DrissionPage, you agree to abide by this agreement. The author bears no legal risk or loss caused by your violation of this agreement; you are responsible for all consequences.