torequests package¶
torequests.main¶
-
class
torequests.main.Pool(n=None, timeout=None, default_callback=None, catch_exception=True, *args, **kwargs)[source]¶ Makes ThreadPoolExecutor use NewFuture instead of the original concurrent.futures.Future.
WARNING: NewFutures in a Pool do not block the main thread unless NewFuture.x is accessed.
Basic Usage:
    from torequests.main import Pool
    import time

    pool = Pool()


    def use_submit(i):
        time.sleep(i)
        result = 'use_submit: %s' % i
        print(result)
        return result


    @pool.async_func
    def use_decorator(i):
        time.sleep(i)
        result = 'use_decorator: %s' % i
        print(result)
        return result


    tasks = [pool.submit(use_submit, i) for i in (2, 1, 0)
            ] + [use_decorator(i) for i in (2, 1, 0)]
    # pool.x can be ignored
    pool.x
    results = [i.x for i in tasks]
    print(results)

    # use_submit: 0
    # use_decorator: 0
    # use_submit: 1
    # use_decorator: 1
    # use_submit: 2
    # use_decorator: 2
    # ['use_submit: 2', 'use_submit: 1', 'use_submit: 0', 'use_decorator: 2', 'use_decorator: 1', 'use_decorator: 0']
-
all_tasks¶ Kept for API compatibility with torequests.dummy; actually returns self._all_futures.
-
catch_exception= None¶ With catch_exception=True, exceptions are not raised; a FailureException(exception) object is returned instead.
-
default_callback= None¶ Callback used for any task that does not set its own callback.
-
-
class
torequests.main.ProcessPool(n=None, timeout=None, default_callback=None, catch_exception=True, *args, **kwargs)[source]¶ Simple ProcessPool wrapping ProcessPoolExecutor.
    from torequests.main import ProcessPool
    import time

    pool = ProcessPool()


    def use_submit(i):
        time.sleep(i)
        result = 'use_submit: %s' % i
        print(result)
        return result


    def main():
        tasks = [pool.submit(use_submit, i) for i in (2, 1, 0)]
        # pool.x can be ignored
        pool.x
        results = [i.x for i in tasks]
        print(results)


    if __name__ == '__main__':
        main()

    # ['use_submit: 2', 'use_submit: 1', 'use_submit: 0']
    # use_submit: 0
    # use_submit: 1
    # use_submit: 2
-
class
torequests.main.NewFuture(timeout=None, args=None, kwargs=None, callback=None, catch_exception=True)[source]¶ Adds the .x attribute and timeout args to the original Future class.
WARNING: The future's thread will not stop running until the function finishes or the process is killed.
Attr cx: blocks until the task finishes and returns the callback_result.
Attr x: blocks until the task finishes and returns the value the task returned.
Attr task_start_time: timestamp when the task started.
Attr task_end_time: timestamp when the task ended.
Attr task_cost_time: seconds the task took.
Parameters: catch_exception – True will catch all exceptions and return them as FailureException.
-
callback_result¶ Block the main thread until the future finishes, then return future.callback_result.
-
cx¶ Block the main thread until the future finishes, then return future.callback_result.
-
x¶ Block the main thread until the future finishes, then return future.result().
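The blocking .x idea can be illustrated with the standard library alone. The sketch below is not torequests code: FutureView is a hypothetical wrapper name, and returning the raw exception object is only an approximation of the FailureException behaviour described above.

```python
from concurrent.futures import Future, ThreadPoolExecutor


class FutureView:
    """Hypothetical wrapper (not torequests code) exposing a blocking ``.x``."""

    def __init__(self, future: Future, timeout=None, catch_exception=True):
        self._future = future
        self._timeout = timeout
        self._catch_exception = catch_exception

    @property
    def x(self):
        # Block until the task finishes, then hand back its result.
        try:
            return self._future.result(self._timeout)
        except Exception as error:
            if self._catch_exception:
                # Return the exception object instead of raising it.
                return error
            raise


with ThreadPoolExecutor(max_workers=2) as executor:
    ok = FutureView(executor.submit(pow, 2, 10))
    failed = FutureView(executor.submit(int, "not a number"))
    ok_result = ok.x
    failed_result = failed.x

print(ok_result)
print(type(failed_result).__name__)
```

Accessing .x is the only point where the caller waits; until then the submitted work runs in the background.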
-
-
torequests.main.Async(f, n=None, timeout=None)[source]¶ Concise usage for pool.submit.
Basic usage of Async & threads:
    from torequests.main import Async, threads
    import time


    def use_submit(i):
        time.sleep(i)
        result = 'use_submit: %s' % i
        print(result)
        return result


    @threads()
    def use_decorator(i):
        time.sleep(i)
        result = 'use_decorator: %s' % i
        print(result)
        return result


    new_use_submit = Async(use_submit)
    tasks = [new_use_submit(i) for i in (2, 1, 0)
            ] + [use_decorator(i) for i in (2, 1, 0)]
    print([type(i) for i in tasks])
    results = [i.x for i in tasks]
    print(results)

    # use_submit: 0
    # use_decorator: 0
    # [<class 'torequests.main.NewFuture'>, <class 'torequests.main.NewFuture'>, <class 'torequests.main.NewFuture'>, <class 'torequests.main.NewFuture'>, <class 'torequests.main.NewFuture'>, <class 'torequests.main.NewFuture'>]
    # use_submit: 1
    # use_decorator: 1
    # use_submit: 2
    # use_decorator: 2
    # ['use_submit: 2', 'use_submit: 1', 'use_submit: 0', 'use_decorator: 2', 'use_decorator: 1', 'use_decorator: 0']
-
torequests.main.get_results_generator(future_list, timeout=None, sort_by_completed=False)[source]¶ Return a generator of the tasks, ordered by completion sequence.
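A rough standard-library sketch of the two orderings follows. The function name results_generator and the way sort_by_completed switches between the orderings are assumptions made for illustration; the library's actual behaviour may differ.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def results_generator(futures, timeout=None, sort_by_completed=False):
    """Hypothetical stand-in: yield results in completion or submission order."""
    if sort_by_completed:
        # as_completed yields each future as soon as it finishes.
        for future in as_completed(futures, timeout=timeout):
            yield future.result()
    else:
        # Submission order: wait for each future in turn.
        for future in futures:
            yield future.result(timeout)


def slow_echo(delay):
    time.sleep(delay)
    return delay


with ThreadPoolExecutor(max_workers=2) as executor:
    futures = [executor.submit(slow_echo, d) for d in (0.3, 0.0)]
    by_completion = list(results_generator(futures, sort_by_completed=True))
    by_submission = list(results_generator(futures))

print(by_completion)
print(by_submission)
```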
-
torequests.main.run_after_async(seconds, func, *args, **kwargs)[source]¶ Run the function after seconds asynchronously.
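One common way to get this behaviour with the standard library is threading.Timer. The helper name run_after below is hypothetical, and this is a sketch of the general technique rather than the library's implementation.

```python
import threading


def run_after(seconds, func, *args, **kwargs):
    """Hypothetical helper: call func(*args, **kwargs) after a delay, without blocking."""
    timer = threading.Timer(seconds, func, args=args, kwargs=kwargs)
    timer.start()
    return timer


events = []
timer = run_after(0.1, events.append, "fired")
events_before = list(events)  # the caller continues immediately
timer.join()                  # wait only so the example can show the outcome
print(events_before, events)
```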
-
class
torequests.main.tPool(n=None, interval=0, timeout=None, session=None, catch_exception=True, default_callback=None)[source]¶ Async wrapper for requests.
Parameters: - n – thread pool size, used as the concurrency limit.
- interval – time.sleep(interval) after each task finishes.
- timeout – timeout for each task.result(timeout). It does not stop the underlying function.
- session – a requests.Session instance to use, if needed.
- catch_exception – True will catch all exceptions and return them as FailureException.
- default_callback – default callback for tasks that do not set the callback param.
Usage:
    from torequests.main import tPool
    from torequests.logs import print_info

    trequests = tPool(2, 1)
    test_url = 'http://p.3.cn'
    ss = [
        trequests.get(
            test_url,
            retry=2,
            callback=lambda x: (len(x.content), print_info(len(x.content))))
        for i in range(3)
    ]
    # or [i.x for i in ss]
    trequests.x
    ss = [i.cx for i in ss]
    print_info(ss)

    # [2020-02-11 11:36:33] temp_code.py(10): 612
    # [2020-02-11 11:36:33] temp_code.py(10): 612
    # [2020-02-11 11:36:34] temp_code.py(10): 612
    # [2020-02-11 11:36:34] temp_code.py(16): [(612, None), (612, None), (612, None)]
-
all_tasks¶ Return self.pool._all_futures
-
delete(url, callback=None, retry=0, **kwargs)[source]¶ Similar to requests.delete, but returns a NewFuture.
-
get(url, params=None, callback=None, retry=0, **kwargs)[source]¶ Similar to requests.get, but returns a NewFuture.
-
head(url, callback=None, retry=0, **kwargs)[source]¶ Similar to requests.head, but returns a NewFuture.
-
options(url, callback=None, retry=0, **kwargs)[source]¶ Similar to requests.options, but returns a NewFuture.
-
patch(url, callback=None, retry=0, **kwargs)[source]¶ Similar to requests.patch, but returns a NewFuture.
-
post(url, data=None, json=None, callback=None, retry=0, **kwargs)[source]¶ Similar to requests.post, but returns a NewFuture.
-
put(url, data=None, callback=None, retry=0, **kwargs)[source]¶ Similar to requests.put, but returns a NewFuture.
-
request(method, url, callback=None, retry=0, **kwargs)[source]¶ Similar to requests.request, but returns a NewFuture.
-
x¶ Return self.pool.x
torequests.dummy¶
torequests.utils¶
-
torequests.utils.parse_qs(qs, keep_blank_values=False, strict_parsing=False, encoding='utf-8', errors='replace', max_num_fields=None)[source]¶ Parse a query given as a string argument.
Arguments:
- qs: percent-encoded query string to be parsed.
- keep_blank_values: flag indicating whether blank values in percent-encoded queries should be treated as blank strings. A true value indicates that blanks should be retained as blank strings. The default false value indicates that blank values are to be ignored and treated as if they were not included.
- strict_parsing: flag indicating what to do with parsing errors. If false (the default), errors are silently ignored. If true, errors raise a ValueError exception.
- encoding and errors: specify how to decode percent-encoded sequences into Unicode characters, as accepted by the bytes.decode() method.
- max_num_fields: int. If set, then throws a ValueError if there are more than n fields read by parse_qsl().
Returns a dictionary.
-
torequests.utils.parse_qsl(qs, keep_blank_values=False, strict_parsing=False, encoding='utf-8', errors='replace', max_num_fields=None)[source]¶ Parse a query given as a string argument.
Arguments:
- qs: percent-encoded query string to be parsed.
- keep_blank_values: flag indicating whether blank values in percent-encoded queries should be treated as blank strings. A true value indicates that blanks should be retained as blank strings. The default false value indicates that blank values are to be ignored and treated as if they were not included.
- strict_parsing: flag indicating what to do with parsing errors. If false (the default), errors are silently ignored. If true, errors raise a ValueError exception.
- encoding and errors: specify how to decode percent-encoded sequences into Unicode characters, as accepted by the bytes.decode() method.
- max_num_fields: int. If set, then throws a ValueError if there are more than n fields read by parse_qsl().
Returns a list, as G-d intended.
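The signatures and docstrings of parse_qs and parse_qsl here match the standard library's urllib.parse functions. Assuming torequests.utils simply exposes those, the difference between the two return shapes and the effect of the flags can be seen with the standard library directly:

```python
from urllib.parse import parse_qs, parse_qsl

query = "a=1&a=2&b=&c=%E4%BD%A0"

# Dictionary form: repeated keys are collected into lists, blanks dropped.
as_dict = parse_qs(query)
# Keeping blank values retains "b" with an empty string.
as_dict_with_blanks = parse_qs(query, keep_blank_values=True)
# List form: one (name, value) pair per field, in the original order.
as_pairs = parse_qsl(query, keep_blank_values=True)

# strict_parsing turns malformed fields into a ValueError.
try:
    parse_qs("a=1&broken", strict_parsing=True)
    strict_failed = False
except ValueError:
    strict_failed = True

print(as_dict)
print(as_dict_with_blanks)
print(as_pairs)
print(strict_failed)
```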
-
torequests.utils.urlparse(url, scheme='', allow_fragments=True)[source]¶ Parse a URL into 6 components: <scheme>://<netloc>/<path>;<params>?<query>#<fragment> Return a 6-tuple: (scheme, netloc, path, params, query, fragment). Note that we don’t break the components up in smaller bits (e.g. netloc is a single string) and we don’t expand % escapes.
-
torequests.utils.quote('abc def') → 'abc%20def'[source]¶ Each part of a URL, e.g. the path info, the query, etc., has a different set of reserved characters that must be quoted.
RFC 3986 Uniform Resource Identifiers (URI): Generic Syntax lists the following reserved characters.
- reserved = gen-delims / sub-delims
- gen-delims = “:” / “/” / “?” / “#” / “[” / “]” / “@”
- sub-delims = “!” / “$” / “&” / “'” / “(” / “)” / “*” / “+” / “,” / “;” / “=”
Each of these characters is reserved in some component of a URL, but not necessarily in all of them.
Python 3.7 updates from using RFC 2396 to RFC 3986 to quote URL strings. Now, “~” is included in the set of unreserved characters, so quote() no longer escapes it.
By default, the quote function is intended for quoting the path section of a URL. Thus, it will not encode ‘/’. This character is reserved, but in typical usage the quote function is being called on a path where the existing slash characters are used as reserved characters.
string and safe may be either str or bytes objects. encoding and errors must not be specified if string is a bytes object.
The optional encoding and errors parameters specify how to deal with non-ASCII characters, as accepted by the str.encode method. By default, encoding=’utf-8’ (characters are encoded with UTF-8), and errors=’strict’ (unsupported characters raise a UnicodeEncodeError).
-
torequests.utils.quote_plus(string, safe='', encoding=None, errors=None)[source]¶ Like quote(), but also replace ‘ ‘ with ‘+’, as required for quoting HTML form values. Plus signs in the original string are escaped unless they are included in safe. It also does not have safe default to ‘/’.
-
torequests.utils.unquote(string, encoding='utf-8', errors='replace')[source]¶ Replace %xx escapes by their single-character equivalent. The optional encoding and errors parameters specify how to decode percent-encoded sequences into Unicode characters, as accepted by the bytes.decode() method. By default, percent-encoded sequences are decoded with UTF-8, and invalid sequences are replaced by a placeholder character.
unquote('abc%20def') -> 'abc def'.
-
torequests.utils.unquote_plus(string, encoding='utf-8', errors='replace')[source]¶ Like unquote(), but also replace plus signs by spaces, as required for unquoting HTML form values.
unquote_plus('%7e/abc+def') -> '~/abc def'
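The four quoting helpers mirror the standard library's urllib.parse functions. Assuming they behave identically, the differences between the path-oriented and form-oriented variants look like this:

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

# quote() targets URL paths: "/" is kept because safe defaults to "/".
path = quote("/docs/my file.txt")
# quote_plus() targets form values: space becomes "+", and "+" and "/" are escaped.
form_value = quote_plus("a b+c/d")

# The unquote functions reverse the escaping.
plain = unquote("abc%20def")
plain_form = unquote_plus("%7e/abc+def")

print(path)
print(form_value)
print(plain)
print(plain_form)
```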
-
torequests.utils.urljoin(base, url, allow_fragments=True)[source]¶ Join a base URL and a possibly relative URL to form an absolute interpretation of the latter.
-
torequests.utils.urlsplit(url, scheme='', allow_fragments=True)[source]¶ Parse a URL into 5 components: <scheme>://<netloc>/<path>?<query>#<fragment> Return a 5-tuple: (scheme, netloc, path, query, fragment). Note that we don’t break the components up in smaller bits (e.g. netloc is a single string) and we don’t expand % escapes.
-
torequests.utils.urlunparse(components)[source]¶ Put a parsed URL back together again. This may result in a slightly different, but equivalent URL, if the URL that was parsed originally had redundant delimiters, e.g. a ? with an empty query (the draft states that these are equivalent).
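The URL helpers above match the standard library's urllib.parse functions. Assuming they are the same functions, the following shows how the 6-tuple of urlparse differs from the 5-tuple of urlsplit, and how urlunparse and urljoin relate to them. The example URL is made up.

```python
from urllib.parse import urljoin, urlparse, urlsplit, urlunparse

url = "http://example.com/a/b;type=x?q=1#top"

parsed = urlparse(url)  # 6 components: ";params" is split off the path
split = urlsplit(url)   # 5 components: ";type=x" stays inside the path

rebuilt = urlunparse(parsed)
joined = urljoin("http://example.com/a/b", "../c")

print(parsed.path, parsed.params)
print(split.path)
print(rebuilt)
print(joined)
```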
-
torequests.utils.escape(s, quote=True)[source]¶ Replace special characters “&”, “<” and “>” to HTML-safe sequences. If the optional flag quote is true (the default), the quotation mark characters, both double quote (“) and single quote (‘) characters are also translated.
-
torequests.utils.unescape(s)[source]¶ Convert all named and numeric character references (e.g. &gt;, &#62;, &#x3e;) in the string s to the corresponding unicode characters. This function uses the rules defined by the HTML 5 standard for both valid and invalid character references, and the list of HTML 5 named character references defined in html.entities.html5.
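escape and unescape carry the docstrings of the standard library's html module. Assuming they are the same functions, a round trip looks like this:

```python
from html import escape, unescape

raw = '<a href="x">Tom & Jerry\'s</a>'

escaped = escape(raw)                      # quotes are translated by default
escaped_no_quotes = escape(raw, quote=False)

# Named, decimal and hexadecimal references all decode to the same character.
decoded = unescape("&gt; &#62; &#x3e;")
round_trip = unescape(escaped)

print(escaped)
print(escaped_no_quotes)
print(decoded)
```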
-
torequests.utils.print_mem(unit=None, callback=<function print_info>, rounded=2)[source]¶ Show the process memory usage with psutil; intended only as a convenience helper.
Parameters: unit – B, KB, MB, GB.
-
torequests.utils.curlparse(string, encoding='utf-8')[source]¶ Translate a curl string into a dict of request arguments. File uploads that contain @file_path are not supported.
Parameters: - string – standard curl string, like r'''curl …'''.
- encoding – encoding for the post data.
Basic Usage:
    >>> from torequests.utils import curlparse
    >>> curl_string = '''curl 'https://p.3.cn?skuIds=1&nonsense=1&nonce=0' -H 'Pragma: no-cache' -H 'DNT: 1' -H 'Accept-Encoding: gzip, deflate' -H 'Accept-Language: zh-CN,zh;q=0.9' -H 'Upgrade-Insecure-Requests: 1' -H 'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.84 Safari/537.36' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8' -H 'Cache-Control: no-cache' -H 'Referer: https://p.3.cn?skuIds=1&nonsense=1&nonce=0' -H 'Cookie: ASPSESSIONIDSQRRSADB=MLHDPOPCAMBDGPFGBEEJKLAF' -H 'Connection: keep-alive' --compressed'''
    >>> request_args = curlparse(curl_string)
    >>> request_args
    {'url': 'https://p.3.cn?skuIds=1&nonsense=1&nonce=0', 'headers': {'Pragma': 'no-cache', 'Dnt': '1', 'Accept-Encoding': 'gzip, deflate', 'Accept-Language': 'zh-CN,zh;q=0.9', 'Upgrade-Insecure-Requests': '1', 'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.84 Safari/537.36', 'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8', 'Cache-Control': 'no-cache', 'Referer': 'https://p.3.cn?skuIds=1&nonsense=1&nonce=0', 'Cookie': 'ASPSESSIONIDSQRRSADB=MLHDPOPCAMBDGPFGBEEJKLAF', 'Connection': 'keep-alive'}, 'method': 'get'}
    >>> import requests
    >>> requests.request(**request_args)
    <Response [200]>
-
class
torequests.utils.Null(*args, **kwargs)[source]¶ A Null instance returns itself when called and is always falsy.
-
torequests.utils.null¶ A Null instance returns itself when called and is always falsy.
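The null-object behaviour described here can be sketched in a few lines. NullSketch is a hypothetical name used only for illustration; it is not the library's class and may omit behaviour the real one has.

```python
class NullSketch:
    """Hypothetical illustration of a null object; not the library's class."""

    def __init__(self, *args, **kwargs):
        # Accept and ignore any constructor arguments.
        pass

    def __call__(self, *args, **kwargs):
        # Calling the instance returns the same instance, so calls can chain.
        return self

    def __bool__(self):
        # Always falsy, so ``if obj:`` checks treat it as "nothing".
        return False


null_sketch = NullSketch()
chained = null_sketch(1)(2, key="value")
print(chained is null_sketch, bool(null_sketch))
```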
-
torequests.utils.itertools_chain(*iterables)[source]¶ Compatibility helper for Python 2; on Python 3, use: from itertools import chain.
-
torequests.utils.slice_into_pieces(seq, n)[source]¶ Slice a sequence into n pieces, return a generator of the n pieces.
-
torequests.utils.slice_by_size(seq, size)[source]¶ Slice a sequence into chunks, return a generator of chunks of the given size.
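Chunking by size is commonly written with itertools.islice. The generator below is a sketch under that assumption; chunks_of_size is a hypothetical name and the library's own implementation is not shown in this document.

```python
from itertools import islice


def chunks_of_size(seq, size):
    """Hypothetical sketch: yield consecutive chunks of at most ``size`` items."""
    iterator = iter(seq)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return  # the input is exhausted
        yield chunk


chunks = list(chunks_of_size(range(7), 3))
print(chunks)
```

The last chunk may be shorter than size when the length is not an exact multiple.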
-
torequests.utils.ttime(timestamp=None, tzone=None, fail='', fmt='%Y-%m-%d %H:%M:%S')[source]¶ Translate timestamp into human-readable: %Y-%m-%d %H:%M:%S.
Parameters: - timestamp – the timestamp float, or time.time() by default.
- tzone – time compensation, int(-time.timezone / 3600) by default (can be set with Config.TIMEZONE).
- fail – value returned if an exception is raised.
- fmt – %Y-%m-%d %H:%M:%S; %z does not work.
Return type: str
    >>> ttime()
    2018-03-15 01:24:35
    >>> ttime(1486572818.421858323)
    2017-02-09 00:53:38
-
torequests.utils.ptime(timestr=None, tzone=None, fail=0, fmt='%Y-%m-%d %H:%M:%S')[source]¶ Translate %Y-%m-%d %H:%M:%S into timestamp.
Parameters: - timestr – string like 2018-03-15 01:27:56, or time.time() if not set.
- tzone – time compensation, int(-time.timezone / 3600) by default (can be set with Config.TIMEZONE).
- fail – value returned if an exception is raised.
- fmt – %Y-%m-%d %H:%M:%S; %z does not work.
Return type: int
    >>> ptime('2018-03-15 01:27:56')
    1521048476
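The tzone parameter is an hour offset applied to the timestamp. A standard-library sketch of that idea follows; format_timestamp and parse_timestr are hypothetical names, and tzone=8 is an assumption chosen because it reproduces the sample outputs shown for ttime and ptime.

```python
import calendar
import time

FMT = "%Y-%m-%d %H:%M:%S"


def format_timestamp(timestamp, tzone=0, fmt=FMT):
    # Shift by the hour offset, then format the result as if it were UTC.
    return time.strftime(fmt, time.gmtime(timestamp + tzone * 3600))


def parse_timestr(timestr, tzone=0, fmt=FMT):
    # Read the string as UTC, then remove the hour offset again.
    return calendar.timegm(time.strptime(timestr, fmt)) - tzone * 3600


text = format_timestamp(1486572818.421858323, tzone=8)
stamp = parse_timestr("2018-03-15 01:27:56", tzone=8)
print(text)
print(stamp)
```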
-
torequests.utils.split_seconds(seconds)[source]¶ Split seconds into [day, hour, minute, second, ms]
divisor: 1, 24, 60, 60, 1000
units: day, hour, minute, second, ms
    >>> split_seconds(6666666)
    [77, 3, 51, 6, 0]
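The split can be reproduced with repeated divmod over millisecond-based unit sizes. This is a sketch of the arithmetic only; split_seconds_sketch is a hypothetical name and rounding to whole milliseconds is an assumption.

```python
def split_seconds_sketch(seconds):
    """Hypothetical sketch: return [day, hour, minute, second, ms]."""
    remaining_ms = int(round(seconds * 1000))
    parts = []
    # Unit sizes in milliseconds: day, hour, minute, second.
    for unit_ms in (86_400_000, 3_600_000, 60_000, 1_000):
        count, remaining_ms = divmod(remaining_ms, unit_ms)
        parts.append(count)
    parts.append(remaining_ms)  # whatever is left is milliseconds
    return parts


whole = split_seconds_sketch(6666666)
fractional = split_seconds_sketch(1.5)
print(whole)
print(fractional)
```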
-
torequests.utils.timeago(seconds=0, accuracy=4, format=0, lang='en', short_name=False)[source]¶ Translate seconds into human-readable.
Parameters: - seconds – seconds (float/int).
- accuracy – 4 by default (units[:accuracy]), determines the number of elements.
- format – index of [led, literal, dict].
- lang – en or cn.
- units – day, hour, minute, second, ms.
    >>> timeago(93245732.0032424, 5)
    '1079 days, 05:35:32,003'
    >>> timeago(93245732.0032424, 4, 1)
    '1079 days 5 hours 35 minutes 32 seconds'
    >>> timeago(-389, 4, 1)
    '-6 minutes 29 seconds 0 ms'
-
torequests.utils.timepass(seconds=0, accuracy=4, format=0, lang='en', short_name=False)¶ Translate seconds into human-readable.
Parameters: - seconds – seconds (float/int).
- accuracy – 4 by default (units[:accuracy]), determines the number of elements.
- format – index of [led, literal, dict].
- lang – en or cn.
- units – day, hour, minute, second, ms.
    >>> timeago(93245732.0032424, 5)
    '1079 days, 05:35:32,003'
    >>> timeago(93245732.0032424, 4, 1)
    '1079 days 5 hours 35 minutes 32 seconds'
    >>> timeago(-389, 4, 1)
    '-6 minutes 29 seconds 0 ms'
-
torequests.utils.md5(string, n=32, encoding='utf-8', skip_encode=False)[source]¶ str(obj) -> md5_string
Parameters: - string – string to operate.
- n – md5_str length.
    >>> from torequests.utils import md5
    >>> md5(1, 10)
    '923820dcc5'
    >>> md5('test')
    '098f6bcd4621d373cade4e832627b4f6'
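The sample outputs above are consistent with hashing str(obj) and, when n is less than 32, keeping the middle n characters of the hex digest. That reading is inferred from the examples, not confirmed by the source. A hashlib sketch under that assumption (md5_sketch is a hypothetical name):

```python
import hashlib


def md5_sketch(obj, n=32, encoding="utf-8"):
    """Hypothetical sketch: md5 hex digest of str(obj), trimmed to the middle n chars."""
    digest = hashlib.md5(str(obj).encode(encoding)).hexdigest()
    if n >= 32:
        return digest
    start = (32 - n) // 2  # assumption: shorter digests are cut from the middle
    return digest[start:start + n]


full = md5_sketch("test")
short = md5_sketch(1, 10)
print(full)
print(short)
```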
-
class
torequests.utils.Counts(start=0, step=1)[source]¶ Counter for counting how many times it has been called.
    >>> from torequests.utils import Counts
    >>> cc = Counts()
    >>> cc.x
    1
    >>> cc.x
    2
    >>> cc.now
    2
    >>> cc.current
    2
    >>> cc.sub()
    1
-
c¶
-
current¶
-
now¶
-
s¶
-
start¶
-
step¶
-
total¶
-
x¶
-
-
torequests.utils.unique(seq, key=None, return_as=None)[source]¶ Deduplicate the seq while keeping the order.
Instead of the slow way:
    lambda seq: (x for index, x in enumerate(seq) if seq.index(x)==index)
Parameters: - seq – raw sequence.
- return_as – generator by default, or list / set / str…
    >>> from torequests.utils import unique
    >>> a = [1,2,3,4,2,3,4]
    >>> unique(a)
    <generator object unique.<locals>.<genexpr> at 0x05720EA0>
    >>> unique(a, return_as=str)
    '1234'
    >>> unique(a, return_as=list)
    [1, 2, 3, 4]
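The usual fast alternative to the seq.index approach is a single pass with a set of seen values. The sketch below shows that technique; unique_in_order is a hypothetical name, and the way key is applied is an assumption about what the key parameter means.

```python
def unique_in_order(seq, key=None):
    """Hypothetical sketch: yield first occurrences, preserving input order."""
    seen = set()
    for item in seq:
        marker = key(item) if key else item
        if marker not in seen:
            seen.add(marker)  # O(1) membership test instead of seq.index()
            yield item


numbers = list(unique_in_order([1, 2, 3, 4, 2, 3, 4]))
letters = list(unique_in_order(["a", "A", "b", "B"], key=str.lower))
print(numbers)
print(letters)
```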
-
torequests.utils.unparse_qsl(qsl, sort=False, reverse=False)[source]¶ Reverse conversion for parse_qsl
-
class
torequests.utils.Regex(ensure_mapping=False)[source]¶ Register objects (such as functions) against regular expressions.
    >>> from torequests.utils import Regex, re
    >>> reg = Regex()
    >>> @reg.register_function('http.*cctv.*')
    ... def mock():
    ...     pass
    ...
    >>> reg.register('http.*HELLOWORLD', 'helloworld', instances='http://helloworld', flags=re.I)
    >>> reg.register('http.*HELLOWORLD2', 'helloworld2', flags=re.I)
    >>> reg.find('http://cctv.com')
    [<function mock at 0x031FC5D0>]
    >>> reg.match('http://helloworld')
    ['helloworld']
    >>> reg.match('non-http://helloworld')
    []
    >>> reg.search('non-http://helloworld')
    ['helloworld']
    >>> len(reg.search('non-http://helloworld2'))
    2
    >>> print(reg.show_all())
    ('http.*cctv.*') => => <class 'function'> mock ""
    ('http.*HELLOWORLD', re.IGNORECASE) => http://helloworld => <class 'str'> helloworld
    ('http.*HELLOWORLD2', re.IGNORECASE) => => <class 'str'> helloworld2
-
register(patterns, obj=None, instances=None, **reg_kwargs)[source]¶ Register one object which can be matched/searched by regex.
Parameters: - patterns – a list/tuple/set of regex-pattern.
- obj – return it while search/match success.
- instances – instance list will search/match the patterns.
- reg_kwargs – kwargs for re.compile.
-
-
class
torequests.utils.UA[source]¶ Some common User-Agents for crawler.
Android, iPhone, iPad, Firefox, Chrome, IE6, IE9
-
Android= 'Mozilla/5.0 (Linux; Android 5.1.1; Nexus 6 Build/LYZ28E) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.84 Mobile Safari/537.36'¶
-
Chrome= 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.84 Safari/537.36'¶
-
Firefox= 'Mozilla/5.0 (Windows NT 10.0; WOW64; rv:54.0) Gecko/20100101 Firefox/54.0'¶
-
IE6= 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)'¶
-
IE9= 'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0;'¶
-
WECHAT_ANDROID= 'Mozilla/5.0 (Linux; Android 5.0; SM-N9100 Build/LRX21V) > AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 > Chrome/37.0.0.0 Mobile Safari/537.36 > MicroMessenger/6.0.2.56_r958800.520 NetType/WIFI'¶
-
WECHAT_IOS= 'Mozilla/5.0 (iPhone; CPU iPhone OS 5_1 like Mac OS X) AppleWebKit/534.46 (KHTML, like Gecko) Mobile/9B176 MicroMessenger/4.3.2'¶
-
iPad= 'Mozilla/5.0 (iPad; CPU OS 9_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Version/9.0 Mobile/13B143 Safari/601.1'¶
-
iPhone= 'Mozilla/5.0 (iPhone; CPU iPhone OS 10_3 like Mac OS X) AppleWebKit/602.1.50 (KHTML, like Gecko) CriOS/56.0.2924.75 Mobile/14E5239e Safari/602.1'¶
-
-
torequests.utils.try_import(module_name, names=None, default=<class 'torequests.exceptions.ImportErrorModule'>, warn=True)[source]¶ Try to import module_name; on ImportError, return default. Useful for catching ImportError and for lazy imports.
-
torequests.utils.ensure_request(request)[source]¶ Used for requests.request / Requests.request with ensure_request(request).
Parameters: request (dict) – dict, curl string, or URL.
Returns: dict of request.
Return type: dict
Basic Usage:
    >>> from torequests.utils import ensure_request
    >>> ensure_request('''curl http://test.com''')
    {'url': 'http://test.com', 'method': 'get'}
    >>> ensure_request('http://test.com')
    {'method': 'get', 'url': 'http://test.com'}
    >>> ensure_request({'method': 'get', 'url': 'http://test.com'})
    {'method': 'get', 'url': 'http://test.com'}
    >>> ensure_request({'url': 'http://test.com'})
    {'url': 'http://test.com', 'method': 'get'}
-
class
torequests.utils.Timer(name=None, log_func=None, default_timer=None, rounding=None, readable=None, log_after_del=True, stack_level=1)[source]¶ Usage: init Timer anywhere, such as at the head of a function or module; it will show the log after it is deleted by the garbage collector.
Parameters: - name – used in the log, or None.
- log_func – some function to show the progress.
- default_timer – timeit.default_timer by default.
- rounding – None, or seconds will be round(xxx, rounding).
- readable – None, or use timepass: readable(cost_seconds) -> 00:00:01,234.
Basic Usage:
    from torequests.utils import Timer
    import time

    Timer()


    @Timer.watch()
    def test(a=1):
        Timer()
        time.sleep(1)

        def test_inner():
            t = Timer('test_non_del')
            time.sleep(1)
            t.x

        test_inner()


    test(3)
    time.sleep(1)

    # [2018-03-10 02:16:48]: Timer [00:00:01]: test_non_del, start at 2018-03-10 02:16:47.
    # [2018-03-10 02:16:48]: Timer [00:00:02]: test(a=3), start at 2018-03-10 02:16:46.
    # [2018-03-10 02:16:48]: Timer [00:00:02]: test(3), start at 2018-03-10 02:16:46.
    # [2018-03-10 02:16:49]: Timer [00:00:03]: <module>: __main__ (temp_code.py), start at 2018-03-10 02:16:46.
-
passed¶ Return the cost_seconds after starting up.
-
string¶ Only return the expect_string quietly.
-
x¶ Call self.log_func(self) and return expect_string.
-
class
torequests.utils.ClipboardWatcher(interval=0.2, callback=None)[source]¶ Watch clipboard with pyperclip, run callback while changed.
-
current¶ Return the current clipboard content.
-
x¶ Return self.watch()
-
-
class
torequests.utils.Saver(path=None, save_mode='json', auto_backup=False, **saver_args)[source]¶ Simple object persistence toolkit using pickle/json, suitable only if you do not care about performance and security. Do not use keys that start with “_”.
Parameters: - path – if not set, will be ~/_saver.db. print(self._path) to show it. Set pickle’s protocol < 3 for compatibility between python2/3, but use -1 for performance and some other optimizations.
- save_mode – pickle / json.
    >>> ss = Saver()
    >>> ss._path
    '/home/work/_saver.json'
    >>> ss.a = 1
    >>> ss['b'] = 2
    >>> ss._update({'c': 3, 'd': 4})
    >>> str(ss)
    "{'a': 1, 'b': 2, 'c': 3, 'd': 4}"
    >>> del ss.b
    >>> str(ss)
    "{'a': 1, 'c': 3, 'd': 4}"
    >>> ss
    Saver(path="/home/work/_saver.json"){'a': 1, 'c': 3, 'd': 4}
-
torequests.utils.guess_interval(nums, accuracy=0)[source]¶ Given a sequence of numbers, return the median of the intervals between adjacent sorted values, counting only intervals >= accuracy.
Basic Usage:
    from torequests.utils import guess_interval
    import random

    seq = [random.randint(1, 100) for i in range(20)]
    print(guess_interval(seq, 5))

    # sorted_seq: [2, 10, 12, 19, 19, 29, 30, 32, 38, 40, 41, 54, 62, 69, 75, 79, 82, 88, 97, 99]
    # diffs: [8, 7, 10, 6, 13, 8, 7, 6, 6, 9]
    # median: 8
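The sample trace (sorted_seq, diffs, median) can be reproduced with the standard library. In this sketch guess_interval_sketch is a hypothetical name, and statistics.median_high is an assumption: it is used because it yields 8 for the sample diffs, whereas a plain median would give 7.5.

```python
import statistics


def guess_interval_sketch(nums, accuracy=0):
    """Hypothetical sketch: median of adjacent gaps that are >= accuracy."""
    ordered = sorted(nums)
    diffs = [b - a for a, b in zip(ordered, ordered[1:]) if b - a >= accuracy]
    if not diffs:
        return 0  # assumption: no qualifying gap means no interval
    # median_high picks the upper middle value for even-length input.
    return statistics.median_high(diffs)


seq = [2, 10, 12, 19, 19, 29, 30, 32, 38, 40,
       41, 54, 62, 69, 75, 79, 82, 88, 97, 99]
interval = guess_interval_sketch(seq, 5)
print(interval)
```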
-
torequests.utils.split_n(string, seps, reg=False)[source]¶ Split strings into n-dimensional list.
Basic Usage:
    from torequests.utils import split_n

    ss = '''a b c  d e f  1 2 3  4 5 6
    a b c  d e f  1 2 3  4 5 6
    a b c  d e f  1 2 3  4 5 6'''

    print(split_n(ss, ('\n', '  ', ' ')))
    # [[['a', 'b', 'c'], ['d', 'e', 'f'], ['1', '2', '3'], ['4', '5', '6']], [['a', 'b', 'c'], ['d', 'e', 'f'], ['1', '2', '3'], ['4', '5', '6']], [['a', 'b', 'c'], ['d', 'e', 'f'], ['1', '2', '3'], ['4', '5', '6']]]
    print(split_n(ss, ['\s+'], reg=1))
    # ['a', 'b', 'c', 'd', 'e', 'f', '1', '2', '3', '4', '5', '6', 'a', 'b', 'c', 'd', 'e', 'f', '1', '2', '3', '4', '5', '6', 'a', 'b', 'c', 'd', 'e', 'f', '1', '2', '3', '4', '5', '6']
-
class
torequests.utils.Cooldown(init_items=None, interval=0, born_at_now=False)[source]¶ Thread-safe Cooldown toolkit.
Parameters: - init_items – iterables to add into the default queue at first.
- interval – each item cools down for interval seconds before it is returned again.
- born_at_now – if set to True, item.use_at will be set to time.time() instead of 0 when the item is first added to the queue.
    >>> from torequests.logs import print_info
    >>> cd = Cooldown(range(1, 3), interval=2)
    >>> cd.add_items([3, 4])
    >>> cd.add_item(5)
    >>> for _ in range(7):
    ...     print_info(cd.get(1, 'timeout'))
    [2019-01-17 01:50:59] pyld.py(152): 1
    [2019-01-17 01:50:59] pyld.py(152): 3
    [2019-01-17 01:50:59] pyld.py(152): 5
    [2019-01-17 01:50:59] pyld.py(152): 2
    [2019-01-17 01:50:59] pyld.py(152): 4
    [2019-01-17 01:51:00] pyld.py(152): timeout
    [2019-01-17 01:51:01] pyld.py(152): 1
    >>> cd.size
    5
-
all_items¶
-
size¶
-
torequests.utils.curlrequests(curl_string, **kwargs)[source]¶ Use tPool to send the request described by a curl string. kwargs may contain req, an object that has a request method, such as req=requests.
Parameters: - curl_string (str) – standard curl string.
- kwargs – valid kwargs for tPool.
Basic Usage:
    from torequests.utils import curlrequests

    r = curlrequests('''curl 'http://p.3.cn/' -H 'Connection: keep-alive' -H 'Cache-Control: max-age=0' -H 'Upgrade-Insecure-Requests: 1' -H 'User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.119 Safari/537.36' -H 'DNT: 1' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8' -H 'Accept-Encoding: gzip, deflate' -H 'Accept-Language: zh-CN,zh;q=0.9,en;q=0.8' -H 'If-None-Match: "55dd9090-264"' -H 'If-Modified-Since: Wed, 26 Aug 2015 10:10:24 GMT' --compressed''', retry=1)
    print(r.text)
-
torequests.utils.sort_url_query(url, reverse=False, _replace_kwargs=None)[source]¶ Sort URL query args. _replace_kwargs is a dict used to update attributes before sorting (such as scheme / netloc…).
Example: http://www.google.com?b=2&z=26&a=1 => http://www.google.com?a=1&b=2&z=26
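Sorting query arguments can be sketched with urllib.parse alone. sort_query_sketch is a hypothetical name; the _replace_kwargs feature is left out, and re-encoding via urlencode is an assumption that may escape some values differently from the library.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def sort_query_sketch(url, reverse=False):
    """Hypothetical sketch: return url with its query pairs sorted by name, then value."""
    parts = urlsplit(url)
    pairs = sorted(parse_qsl(parts.query, keep_blank_values=True), reverse=reverse)
    # Rebuild the URL with only the query component replaced.
    return urlunsplit(parts._replace(query=urlencode(pairs)))


sorted_url = sort_query_sketch("http://www.google.com?b=2&z=26&a=1")
reversed_url = sort_query_sketch("http://www.google.com?b=2&z=26&a=1", reverse=True)
print(sorted_url)
print(reversed_url)
```

Normalising query order this way is useful when URLs serve as cache or deduplication keys.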
-
torequests.utils.retry(tries=1, exceptions: Tuple[Type[BaseException]] = (<class 'Exception'>, ), catch_exception=False)[source]¶
-
torequests.utils.get_readable_size(input_num, unit=None, rounded=<object object>, format='%s %s', units=None, carry=1024)[source]¶ Show the num readable with unit.
Parameters: - input_num (float, int) – raw number
- unit (str, optional) – target unit, defaults to None for auto set.
- rounded (None or int, optional) – defaults to NotSet, which returns the raw float without rounding.
- format (str, optional) – output string format, defaults to “%s %s”
- units (list, optional) – unit list, defaults to None for computer storage unit
- carry (int, optional) – carry a number as in adding, defaults to 1024
Returns: string for input_num with unit.
Return type: str
torequests.configs¶
torequests.crawlers¶
torequests.exceptions¶
-
exception
torequests.exceptions.CommonException(name)[source]¶ This exception mainly exists so that bool(self) is False; it is not callable.
torequests.logs¶
-
torequests.logs.init_logger(name='', handler_path_levels=None, level=20, formatter=None, formatter_str=None, datefmt='%Y-%m-%d %H:%M:%S')[source]¶ Add a default handler for logger.
Args:
- name = '' or logger obj.
- handler_path_levels = [['loggerfile.log', 13], ['', 'DEBUG'], ['', 'info'], ['', 'notSet']] # [[path, level]]
- level = the least level for the logger.
- formatter = logging.Formatter('%(levelname)-7s %(asctime)s %(name)s (%(filename)s: %(lineno)s): %(message)s', "%Y-%m-%d %H:%M:%S")
- formatter_str = '%(levelname)-7s %(asctime)s %(name)s (%(funcName)s: %(lineno)s): %(message)s'
- custom formatter: %(asctime)s %(created)f %(filename)s %(funcName)s %(levelname)s %(levelno)s %(lineno)s %(message)s %(module)s %(name)s %(pathname)s %(process)s %(relativeCreated)s %(thread)s %(threadName)s
-
torequests.logs.print_info(*messages, **kwargs)[source]¶ Simple print via the logger, printing with time / file / line_no.
Parameters: sep – separator between messages, " " by default.
Basic Usage:
    print_info(1, 2, 3)
    print_info(1, 2, 3)
    print_info(1, 2, 3)

    # [2018-10-24 19:12:16] temp_code.py(7): 1 2 3
    # [2018-10-24 19:12:16] temp_code.py(8): 1 2 3
    # [2018-10-24 19:12:16] temp_code.py(9): 1 2 3
torequests.frequency_controller.sync_tools¶
-
class
torequests.frequency_controller.sync_tools.Frequency(n=None, interval=0)[source]¶ Frequency controller: allows n tasks to run concurrently in every interval seconds.
Basic Usage:
    from torequests.frequency_controller.sync_tools import Frequency
    from concurrent.futures import ThreadPoolExecutor
    from time import time

    # limit to 2 concurrent tasks each 1 second
    frequency = Frequency(2, 1)


    def test():
        with frequency:
            return time()


    now = time()
    pool = ThreadPoolExecutor()
    tasks = []
    for _ in range(5):
        tasks.append(pool.submit(test))
    result = [task.result() for task in tasks]
    assert result[0] - now < 1
    assert result[1] - now < 1
    assert result[2] - now > 1
    assert result[3] - now > 1
    assert result[4] - now > 2
    assert frequency.to_dict() == {'n': 2, 'interval': 1}
    assert frequency.to_list() == [2, 1]
-
TIMER()¶ time() -> floating point number
Return the current time in seconds since the Epoch. Fractions of a second may be present if the system clock provides them.
-
classmethod
ensure_frequency(frequency)[source]¶ Ensure the given args is Frequency.
Parameters: frequency (Frequency / dict / list / tuple) – args to create a Frequency instance.
Returns: Frequency instance
Return type: Frequency
-
gen¶
-
interval¶
-
lock¶
-
n¶
-
repr¶
-
torequests.frequency_controller.async_tools¶
-
class
torequests.frequency_controller.async_tools.AsyncFrequency(n=None, interval=0)[source]¶ AsyncFrequency controller: allows n tasks to run concurrently in every interval seconds.
Basic Usage:
    from torequests.frequency_controller.async_tools import AsyncFrequency
    from asyncio import ensure_future, get_event_loop
    from time import time


    async def test_async():
        frequency = AsyncFrequency(2, 1)

        async def task():
            async with frequency:
                return time()

        now = time()
        tasks = [ensure_future(task()) for _ in range(5)]
        result = [await task for task in tasks]
        assert result[0] - now < 1
        assert result[1] - now < 1
        assert result[2] - now > 1
        assert result[3] - now > 1
        assert result[4] - now > 2
        assert frequency.to_dict() == {'n': 2, 'interval': 1}
        assert frequency.to_list() == [2, 1]


    get_event_loop().run_until_complete(test_async())
-
TIMER()¶ time() -> floating point number
Return the current time in seconds since the Epoch. Fractions of a second may be present if the system clock provides them.
-
classmethod
ensure_frequency(frequency)[source]¶ Ensure the given args is AsyncFrequency.
Parameters: frequency (AsyncFrequency / dict / list / tuple) – args to create an AsyncFrequency instance.
Returns: AsyncFrequency instance
Return type: AsyncFrequency
-
gen¶
-
interval¶
-
lock¶
-
n¶
-
repr¶
-