proxy_py README
proxy_py is a program that collects proxies, saves them in a database, and checks them periodically. It has a server for getting proxies with a nice API (see below).
Where is the documentation?
It's here -> https://proxy-py.readthedocs.io
How to support this project?
You can donate here -> https://www.patreon.com/join/2313433
Thank you :)
How to install?
There is a prepared docker image.
1 Install Docker and Docker Compose. If you're using Ubuntu:
sudo apt install docker.io docker-compose
2 Download the docker compose config:
wget "https://raw.githubusercontent.com/DevAlone/proxy_py/master/docker-compose.yml"
3 Create a container
4 Run
This gives you a server at localhost:55555
To see running containers use
To stop proxy_py use
How to get proxies?
proxy_py has a server, based on aiohttp, which listens on 127.0.0.1:55555 (you can change this in the settings file) and provides proxies. To get proxies, send the following JSON request to http://127.0.0.1:55555/api/v1/ (or to another domain if the server is behind a reverse proxy):
{
    "model": "proxy",
    "method": "get",
    "order_by": "response_time, uptime"
}
Note: order_by sorts the result by one or more fields (separated by commas). You can omit it. The required fields are model and method.
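As a small illustration of the note above, the request body can be assembled in Python. This is only a sketch: `build_request` is a hypothetical helper name, not part of proxy_py, and it simply joins the sort fields with commas as the note describes.

```python
import json


def build_request(model, method, order_by=None):
    """Build the JSON body for an API call (hypothetical helper).

    `model` and `method` are required; `order_by` is an optional
    sequence of field names.
    """
    if not model or not method:
        raise ValueError("both 'model' and 'method' are required")

    payload = {"model": model, "method": method}
    if order_by:
        # Several sort fields are sent as one comma-separated string.
        payload["order_by"] = ", ".join(order_by)
    return payload


body = build_request("proxy", "get", order_by=["response_time", "uptime"])
print(json.dumps(body, indent=4))
```

Leaving `order_by` out produces a body with only the two required fields.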
It returns a JSON response like this:
{
    "count": 1,
    "data": [{
        "address": "http://127.0.0.1:8080",
        "auth_data": "",
        "bad_proxy": false,
        "domain": "127.0.0.1",
        "last_check_time": 1509466165,
        "number_of_bad_checks": 0,
        "port": 8080,
        "protocol": "http",
        "response_time": 461691,
        "uptime": 1509460949
    }],
    "has_more": false,
    "status": "ok",
    "status_code": 200
}
Note: All fields except protocol, domain, port, auth_data, checking_period and address CAN be null.
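Because some fields can be null, client code may want to guard against `None` before comparing values. The sketch below assumes the response shape shown above. `fastest_proxies` is a hypothetical helper, and the second entry in `sample` is made-up data added only to show a null value.

```python
def fastest_proxies(response, limit=10):
    """Return up to `limit` proxy addresses, lowest response_time first.

    Hypothetical helper: entries whose response_time is null
    (None in Python) are skipped rather than compared.
    """
    proxies = [
        proxy for proxy in response.get("data", [])
        if proxy.get("response_time") is not None
    ]
    proxies.sort(key=lambda proxy: proxy["response_time"])
    return [proxy["address"] for proxy in proxies[:limit]]


# Sample data in the documented shape; the second entry is invented.
sample = {
    "count": 2,
    "data": [
        {"address": "http://127.0.0.1:8080", "response_time": 461691},
        {"address": "http://127.0.0.1:8081", "response_time": None},
    ],
    "has_more": False,
    "status": "ok",
    "status_code": 200,
}

print(fastest_proxies(sample))
```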
Or an error if something went wrong:
{
    "error_message": "You should specify \"model\"",
    "status": "error",
    "status_code": 400
}
Note: status_code is also duplicated in the HTTP status code.
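One way a client might handle both response shapes is to turn error responses into exceptions. This is a sketch under the assumption that responses look like the two examples above; `ProxyApiError` and `unwrap` are hypothetical names, not part of proxy_py.

```python
class ProxyApiError(Exception):
    """Raised when the API reports an error (hypothetical class)."""

    def __init__(self, status_code, message):
        super().__init__(f"{status_code}: {message}")
        self.status_code = status_code
        self.message = message


def unwrap(response):
    """Return the "data" list of a parsed response or raise ProxyApiError."""
    if response.get("status") != "ok":
        raise ProxyApiError(
            response.get("status_code"),
            response.get("error_message", "unknown error"),
        )
    return response["data"]


# The error response shown above, as a Python dict.
error_response = {
    "error_message": 'You should specify "model"',
    "status": "error",
    "status_code": 400,
}

try:
    unwrap(error_response)
except ProxyApiError as error:
    print("request failed:", error)
```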
Example using curl:
curl -X POST http://127.0.0.1:55555/api/v1/ -H "Content-Type: application/json" --data '{"model": "proxy", "method": "get"}'
Example using httpie:
http POST http://127.0.0.1:55555/api/v1/ model=proxy method=get
Example using python's requests library:
import requests
import json


def get_proxies():
    result = []
    json_data = {
        "model": "proxy",
        "method": "get",
    }
    url = "http://127.0.0.1:55555/api/v1/"

    response = requests.post(url, json=json_data)
    if response.status_code == 200:
        response = json.loads(response.text)
        for proxy in response["data"]:
            result.append(proxy["address"])
    else:
        # check error here
        pass

    return result
Example using aiohttp library:
import aiohttp
import json  # needed for json.loads below


async def get_proxies():
    result = []
    json_data = {
        "model": "proxy",
        "method": "get",
    }
    url = "http://127.0.0.1:55555/api/v1/"

    async with aiohttp.ClientSession() as session:
        async with session.post(url, json=json_data) as response:
            if response.status == 200:
                response = json.loads(await response.text())
                for proxy in response["data"]:
                    result.append(proxy["address"])
            else:
                # check error here
                pass

    return result
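Once you have an address, you will probably want to route a request through it. The sketch below shows one possible way to turn a returned `address` into the `proxies` mapping that the requests library accepts. `to_requests_proxies` is a hypothetical helper, and using the same proxy for both schemes is an assumption, not something proxy_py prescribes.

```python
def to_requests_proxies(address):
    """Map one proxy address to a requests-style `proxies` dict.

    Hypothetical helper: the same proxy is used for both plain HTTP
    and HTTPS targets.
    """
    return {"http": address, "https": address}


proxies = to_requests_proxies("http://127.0.0.1:8080")
print(proxies)
```

The resulting dict can be passed as `requests.get(url, proxies=proxies, timeout=10)`. SOCKS addresses may need the optional SOCKS extra of requests to be installed.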
How to interact with API?
Read more about API here -> https://proxy-py.readthedocs.io/en/latest/api_v1_overview.html
# TODO: add readme about API v2
What about WEB interface?
There is a lib.ru-inspired web interface, which consists of these pages (with a slash at the end):
- http://localhost:55555/i/get/proxy/
- http://localhost:55555/i/get/proxy_count_item/
- http://localhost:55555/i/get/number_of_proxies_to_process/
- http://localhost:55555/i/get/collector_state/
How to contribute?
Just fork the repository, make your changes (implement a new collector, fix a bug, or whatever you want), and create a pull request.
Here are some useful guides:
How to test it?
If you've made changes to the code and want to check that you didn't break anything, just run
inside a virtual environment in the proxy_py project directory.
How to use custom checkers/collectors?
If you want to collect proxies from your own source, or you need proxies that work with a particular site, you can write your own collectors and/or checkers.
- Create your checkers/collectors in the current directory, following this directory structure:
// TODO: add more detailed readme about it
local/
├── requirements.txt
├── checkers
│   └── custom_checker.py
└── collectors
    └── custom_collector.py
You can create only a checker or only a collector if you prefer.
- Create proxy_py/settings.py in the current directory with the following content:
from ._settings import *
from local.checkers.custom_checker import CustomChecker

PROXY_CHECKERS = [CustomChecker]
COLLECTORS_DIRS = ['local/collectors']
You can append to PROXY_CHECKERS or COLLECTORS_DIRS instead of overriding them, so that the built-in ones are used as well; it's just a normal Python file. See proxy_py/_settings.py for more detailed instructions on options.
- Follow the steps in "How to install?" but download this docker-compose config instead
wget "https://raw.githubusercontent.com/DevAlone/proxy_py/master/docker-compose-with-local.yml"
and run it with the command
docker-compose -f docker-compose-with-local.yml up
- ...?
- Profit!
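The settings step above mentions appending to PROXY_CHECKERS or COLLECTORS_DIRS instead of overriding them. A minimal sketch of that variant follows. To keep it runnable on its own, the names that `from ._settings import *` and the `local.checkers` import would provide are replaced with stand-ins: `BuiltInChecker` and the default `"collectors"` directory are hypothetical placeholders, not the project's actual defaults.

```python
# Stand-ins for what `from ._settings import *` would provide in a real
# proxy_py/settings.py. These values are hypothetical placeholders.
class BuiltInChecker:
    pass


PROXY_CHECKERS = [BuiltInChecker]
COLLECTORS_DIRS = ["collectors"]


# Stand-in for `from local.checkers.custom_checker import CustomChecker`.
class CustomChecker:
    pass


# Extend the defaults instead of replacing them, so the built-in
# checkers and collectors stay active alongside the custom ones.
PROXY_CHECKERS = PROXY_CHECKERS + [CustomChecker]
COLLECTORS_DIRS = COLLECTORS_DIRS + ["local/collectors"]

print(PROXY_CHECKERS, COLLECTORS_DIRS)
```

In a real settings file only the last two assignments would be needed, placed after the two imports.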
How to build from scratch?
- Clone this repository
git clone https://github.com/DevAlone/proxy_py.git
- Install requirements
cd proxy_py
pip3 install -r requirements.txt
- Create settings file
cp config_examples/settings.py proxy_py/settings.py
- Install PostgreSQL and change the database configuration in the settings.py file
- (Optional) Configure alembic
- Run your application
- Enjoy!