
keoki

GF  2020-11-28 16:34
(Some scripts: https://bbs.imoutolove.me/read.php?tid=1353704)

Sharing a way to get resources from a certain adult-content site for free (crawler now posted, for anyone interested in trying it)

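Update: the crawler is posted. A test crawl of the first 100 list pages collected 1000 resources; the results are attached in this thread: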
https://bbs.north-plus.net/u.php?action-topic-uid-1156494.html




https://www.flhk.xyz/




I happened to notice that this site's download links are present in the HTML source; the page simply does not display them:



Press CTRL + U to open the page source. The <meta> tag contains the download link and the archive password:




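The line in question:
<meta name="description" content="下载地址: [link omitted] 提取码:[omitted] 解压密码:[omitted](下载完后缀名改成zip)">
(The content attribute holds the download address, the share extraction code, and the archive password, plus a note to rename the file to .zip after downloading.)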


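The site has quite a lot of resources. I don't know how long this hole will last: it is a basic mistake, and the site owner probably isn't very technical and just set up WordPress with a one-click installer.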

It may not stay open for long.
 
 



Crawler update: in a test run, crawling all resources on 5 pages took 14 seconds.

  1. import asyncio
  2. from lxml import etree
  3. # import re
  4. import aiohttp
  5. import time
  6. # import uvloop
  7. import tqdm
  8. base_url = 'https://www.flhk.xyz/page/{}'
  9. # work_lst = []
  10. # asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
  11. async def get_dir_page(page, session):
  12.     try:
  13.         async with session.get(url=base_url.format(page)) as resp:
  14.             text = await resp.text(encoding='utf-8')
  15.             return text
  16.     except:
  17.         return None
  18. async def get_link_passwd(href, title, session):
  19.     async with session.get(href) as resp:
  20.         text = await resp.text(encoding='utf-8')
  21.         html = etree.HTML(text)
  22.         meta_descrp = html.xpath('//meta[@name="description"]/@content')
  23.         if meta_descrp:
  24.             link_and_passwd = meta_descrp[0]
  25.             print('Get link and passwd:\n{} \n  {} {}'.format(
  26.                 link_and_passwd, title, href))
  27.             return title, href, link_and_passwd
  28.         else:
  29.             print('No download link available for {} {}'.format(title, href))
  30. async def Main():
  31.     start = time.time()
  32.     # global work_lst
  33.     async with aiohttp.ClientSession() as session:
  34.         tasks = [get_dir_page(page, session) for page in range(1, 5)]
  35.         for rslt in tqdm.tqdm(asyncio.as_completed(tasks), total=len(tasks)):
  36.             text = await rslt
  37.             if text:
  38.                 html = etree.HTML(text)
  39.                 ajax_load_divs = html.xpath(
  40.                     '//div[@class="ajax-load-con content wow fadeInUp"]')
  41.                 sub_tasks_lst = []
  42.                 for div in ajax_load_divs:
  43.                     h2 = div.xpath('.//h2')[0]
  44.                     href = h2.xpath('./a/@href')[0]
  45.                     title = h2.xpath('./a/@title')[0]
  46.                     sub_tasks_lst.append((href, title, session))
  47.                 sub_tasks = [get_link_passwd(*tp) for tp in sub_tasks_lst]
  48.                 for f in asyncio.as_completed(sub_tasks):
  49.                     rslt_tp = await f
  50.                     if rslt_tp:
  51.                         with open("link_passwds.txt", "a+",
  52.                                   encoding='utf-8') as file:
  53.                             file.write(rslt_tp[1] + ": " + rslt_tp[0] + '\n')
  54.                             file.write(rslt_tp[2] + '\n')
  55.                             file.write('\n')
  56.     end = time.time()
  57.     total_secs = end - start
  58.     print('total_secs:', total_secs)
  59.     return 'done'
  60. loop = asyncio.get_event_loop()
  61. try:
  62.     rslt = loop.run_until_complete(Main())
  63.     print(rslt)
  64. finally:
  65.     loop.close()


Sample results:





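Finally, a plug (updated 2020/11/28) for a live-stream recording tool I wrote (supports Douyu, Bilibili, and Huya); it can capture and display danmaku (on-screen comments):
https://bbs.north-plus.net/read.php?tid-1017998.html
Testing is welcome.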


89e24ee3

B1F  2020-11-29 00:39
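(As long as you don't give up, miracles will happen again and again)
Finished the Python code.
Scraped a batch of entries from the home page (run output below; code at the end)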
Found 21 addresses
[Run output omitted: numbered entries 1 to 21, each a post title followed by that post's meta description line; entry 16 printed no title or description.]
end

I'm a beginner, so please go easy on me!
Code below:
import bs4
import requests
import time
import sys
res = requests.get("https://www.flhk.xyz/")
html = res.text
soup = bs4.BeautifulSoup(html, 'html.parser')
nlinks = []
cnt = -1
for link in soup.find_all('a'):
    t = link.get('href')
    if(t[-10:-6].isdigit() and (cnt<0 or (cnt>=0 and t[-10:-6] != nlinks[cnt][-10:-6]))):
        nlinks.append(t)
        cnt = cnt + 1
print("共找到" + str(cnt+1) + "个地址")
i = 0
for nlink in nlinks:
    try:
        i = i + 1
        print(i)
        html1 = requests.get(nlink).text
        soup1 = bs4.BeautifulSoup(html1, 'html.parser')
        print(soup1.title.string[:-8])
        print(str(soup1.find_all('meta')[2])[15:-22])
    except:
        continue
    time.sleep(.5)
print("end")