Scraping LOL Skin Images

Tisou1 2020-03-26

1. Open the official LOL website, then go to "Database" (资料库) under "Game Info" (游戏资料).

2. Press F12 to inspect the page, and find hero_list.js in the Network tab.

You can see that the hero list is fetched through an Ajax request.
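As a minimal sketch of what that response looks like once parsed: the snippet assumes the payload is JSON with a top-level `hero` array whose entries carry `heroId`, `name`, `alias` and `title`, which are the fields the source code at the end of this post reads. The sample payload and the helper name `parse_hero_list` are made up for illustration; real data comes from hero_list.js.

```python
import json

# Hypothetical sample shaped like the fields the scraper reads; not real data.
SAMPLE_PAYLOAD = (
    '{"hero": [{"heroId": "64", "name": "name-placeholder", '
    '"alias": "alias-placeholder", "title": "title-placeholder"}]}'
)


def parse_hero_list(payload: str) -> list:
    """Return the list of hero records from a hero_list.js style JSON string."""
    data = json.loads(payload)
    return data["hero"]


heroes = parse_hero_list(SAMPLE_PAYLOAD)
for hero in heroes:
    # Each record is a plain dict, so fields are read by key.
    print(hero["heroId"], hero["name"], hero["alias"], hero["title"])
```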

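3. Click on a few hero portraits, and a pattern in the image URLs becomes apparent.

https://game.gtimg.cn/images/lol/act/img/skin/big7004.jpg is one of LeBlanc's skins,

https://game.gtimg.cn/images/lol/act/img/skin/big64005.jpg is one of Lee Sin's skins.

The number after big is the hero ID followed by the skin ID, with the skin ID zero-padded to three digits (7 + 004, 64 + 005).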
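The pattern can be sketched as a small helper. The function name `build_skin_url` is hypothetical, and the three-digit zero padding is inferred from the two example URLs above rather than from any official documentation.

```python
SKIN_URL = "https://game.gtimg.cn/images/lol/act/img/skin/big{}.jpg"


def build_skin_url(hero_id, skin_id: int) -> str:
    """Join the hero ID and the zero-padded, three-digit skin ID into an image URL."""
    return SKIN_URL.format(f"{hero_id}{skin_id:03d}")


print(build_skin_url(7, 4))   # https://game.gtimg.cn/images/lol/act/img/skin/big7004.jpg
print(build_skin_url(64, 5))  # https://game.gtimg.cn/images/lol/act/img/skin/big64005.jpg
```

`hero_id` may be an int or a string, since it is only interpolated into the URL.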

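4. The hero ID can be found in hero_list.js. As for the skin ID, we can set an upper bound and probe each value in a loop; a hero generally has fewer than 25 skins (chroma skins are not included here).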

5. Concatenate the pieces into a URL, and the image can be downloaded.
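Steps 4 and 5 can be sketched as a probing loop. To keep the example runnable offline, the HTTP check is passed in as a callable; `find_skins`, `status_of` and `fake_status` are hypothetical names, and the fake checker simply pretends that only skins 0 and 5 of hero 64 exist.

```python
from typing import Callable, Iterator, Tuple

SKIN_URL = "https://game.gtimg.cn/images/lol/act/img/skin/big{}.jpg"


def find_skins(hero_id, status_of: Callable[[str], int],
               limit: int = 25) -> Iterator[Tuple[int, str]]:
    """Yield (skin_id, url) for every candidate URL that answers with HTTP 200."""
    for skin_id in range(limit):
        url = SKIN_URL.format(f"{hero_id}{skin_id:03d}")
        if status_of(url) == 200:
            yield skin_id, url


def fake_status(url: str) -> int:
    """Offline stand-in for a real HTTP check (assumption, not real server behavior)."""
    return 200 if url.endswith(("big64000.jpg", "big64005.jpg")) else 404


found = list(find_skins(64, fake_status))
print(found)
```

With the requests library, the checker could be something like `lambda url: requests.get(url, timeout=10).status_code`, and the body of each successful response would then be written to disk.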

The source code is below:

import json
import os
import requests
import vthread


@vthread.pool(10)
def get_heroSkin(hero):
    # Skin image URL template; the placeholder is filled with hero ID + skin ID
    skin_url = 'https://game.gtimg.cn/images/lol/act/img/skin/big{}.jpg'
    heroId = hero['heroId']
    name = hero['name']
    alias = hero['alias']
    title = hero['title']
    # One folder per hero; exist_ok avoids an error if the folder already exists
    hero_dir = f'./lol/{name}-{alias}-{title}'
    os.makedirs(hero_dir, exist_ok=True)
    # Probe skin IDs 000-024; only responses with status 200 are saved
    for i in range(25):
        skin_id = f'{i:03d}'  # zero-pad the skin ID to three digits
        response = requests.get(skin_url.format(f'{heroId}{skin_id}'))
        if response.status_code == 200:
            with open(f'{hero_dir}/{skin_id}.jpg', 'wb') as f:
                f.write(response.content)

def get_heroId(url):
    headers = {

        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) '
                      'Chrome/79.0.3945.130 Safari/537.36 '
    }
    response = requests.get(url, headers=headers)
    html = response.content.decode('utf-8')  # decode the UTF-8 response bytes into a str
    # Alternative: set response.encoding = 'utf-8' and read response.text
    html = json.loads(html)
    heros = html['hero']
    #print(type(html), html['hero'][0])
    for hero in heros:
        get_heroSkin(hero)


def main():
    # Fetch the hero list, which contains the hero IDs
    url = 'https://game.gtimg.cn/images/lol/act/img/js/heroList/hero_list.js'
    if not os.path.exists('./lol'):
        os.mkdir('./lol')
    get_heroId(url)



if __name__ == '__main__':
    main()
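The `@vthread.pool(10)` decorator in the source above hands each `get_heroSkin` call to a pool of worker threads. For readers who prefer the standard library, here is a sketch of the same idea with `concurrent.futures.ThreadPoolExecutor`. This is a swapped-in technique, not the author's method; `download_hero` is a hypothetical stand-in for `get_heroSkin` that does no network I/O, and the hero records are made up.

```python
from concurrent.futures import ThreadPoolExecutor


def download_hero(hero: dict) -> str:
    """Hypothetical stand-in for get_heroSkin; returns the folder name it would use."""
    return f"{hero['name']}-{hero['alias']}-{hero['title']}"


# Made-up records with the same keys the scraper reads.
heroes = [
    {"heroId": "1", "name": "n1", "alias": "a1", "title": "t1"},
    {"heroId": "2", "name": "n2", "alias": "a2", "title": "t2"},
]

# Up to 10 heroes are processed concurrently; map() returns results in input order.
with ThreadPoolExecutor(max_workers=10) as pool:
    folders = list(pool.map(download_hero, heroes))

print(folders)
```

Leaving the `with` block waits for all submitted work to finish, so the program does not exit while downloads are still running.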