php-curl encounters Cloudflare protection

Please Wait…

Please enable cookies.

The PHP request appears to be missing the '__cfruid' cookie.
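Before debugging cookies, it helps to confirm that the response really is the challenge page and not the expected content. The following is a minimal sketch only: the function name looks_like_cloudflare_challenge is hypothetical, and both the status codes and the text markers are assumptions based on the messages quoted above, not a documented Cloudflare contract.

```php
<?php
// Hypothetical helper: guesses whether a response is a Cloudflare challenge
// page rather than the real content. The markers below are assumptions
// taken from the messages quoted in this post.
function looks_like_cloudflare_challenge(int $status, string $body): bool
{
    // Assumption: challenge pages are usually served with 403 or 503.
    if (!in_array($status, [403, 503], true)) {
        return false;
    }
    $markers = ['Please Wait', 'Please enable cookies'];
    foreach ($markers as $marker) {
        // Case-insensitive search for each marker in the body.
        if (stripos($body, $marker) !== false) {
            return true;
        }
    }
    return false;
}
```

The status code can be read after curl_exec() with curl_getinfo($ch, CURLINFO_HTTP_CODE) and passed in together with the response body.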

https://www.jianshu.com/p/bdb7e11e52db

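Method 1: request the target site with browser automation, for example Selenium.
Method 2: use cloudscraper, a Python library written specifically to get past this Cloudflare check.

Documentation: https://pypi.org/project/cloudscraper/
pip install cloudscraper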

—————————————

https://hechengwei.cn/archives/172

http://www.phpheidong.com/blog/article/134474/d155cd4979404c68b7b5/

When the request is run through Postman, the generated cookies look like this:

__cfruid=longStringOfNumbers-shortStringOfNumbers; path=/; domain=.app.mobilecause.com; HttpOnly; Expires=Tue, 19 Jan 2038 03:14:07 GMT;
__cfduid=longStringOfNumbers; path=/; domain=.app.mobilecause.com; HttpOnly; Expires=Thu, 23 Jan 2020 04:54:50 GMT;

Open Chrome, inspect the HTTP information after the redirect, and copy the cookie string, for example:

__cfduid=df9ea159a1b1833b1d124576ec5ec682f1489310399;
_popfired=1; cf_clearance=5f8173592071f4dad5c7b46702365fcf8249f214-1493185177-1800;
sc_is_visitor_unique=rx10571718.1493186070.0CFEEEEFC3F4F5AA7A9719EC80CB35C.1.1.1.1.1.1.1.1.1

Copy that string into the header of the curl request:

$header[] = 'Cookie: ' . $cookie_str;
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
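If the cookies are kept as separate name/value pairs instead of one copied string, the header line can be assembled in code. This is a rough sketch: build_cookie_header is a hypothetical helper, and the cookie values are placeholders, not real tokens.

```php
<?php
// Hypothetical helper: turns name => value pairs into a single Cookie header line.
function build_cookie_header(array $cookies): string
{
    $pairs = [];
    foreach ($cookies as $name => $value) {
        $pairs[] = $name . '=' . $value;
    }
    // Pairs in a Cookie header are separated by "; ".
    return 'Cookie: ' . implode('; ', $pairs);
}

$header = [];
$header[] = build_cookie_header([
    '__cfduid'     => 'placeholder-id',        // placeholder value
    'cf_clearance' => 'placeholder-clearance', // placeholder value
]);
// $header can then be passed to curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
```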

 

$curl = curl_init();

curl_setopt_array($curl, array(
    CURLOPT_URL => "https://app.mobilecause.com/api/v2/reports/transactions.json",
    CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
    CURLOPT_ENCODING => "",
    CURLOPT_MAXREDIRS => 10,
    CURLOPT_TIMEOUT => 30,
    CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
    CURLOPT_CUSTOMREQUEST => "GET",
    CURLOPT_POSTFIELDS => "",
    CURLOPT_COOKIESESSION => true,
    CURLOPT_COOKIEFILE => "cookie.txt", // read cookies from this file
    CURLOPT_COOKIEJAR => "cookie.txt", // write cookies to this file on curl_close()
    CURLOPT_FOLLOWLOCATION => true,
    CURLOPT_HTTPHEADER => array(
        'Authorization: Token token="test_token"',
        "Content-Type: application/x-www-form-urlencoded",
        "cache-control: no-cache",
    ),
));

// Print the request and response headers to STDERR for debugging.
curl_setopt($curl, CURLOPT_VERBOSE, true);

$response = curl_exec($curl);
$err = curl_error($curl);

curl_close($curl);

if ($err) {
    echo "cURL Error #:" . $err;
} else {
    echo $response;
}

—————————————

https://programmierfrage.com/items/php-curl-encounters-cloudflare-please-wait-screen

Fetching the content of an external URL in WordPress

function fetch_body_url($fetch_link){
    $response = wp_remote_get($fetch_link, array('timeout' => 120));
    return wp_remote_retrieve_body($response);
}

<?php
$url = 'https://ficbook.net/authors/1000'; // random profile from the requested website
$agent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.93 Safari/537.36';
$ch = curl_init();
curl_setopt($ch, CURLOPT_USERAGENT, $agent);
curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookies.txt');
curl_setopt($ch, CURLOPT_COOKIEFILE, 'cookies.txt');
curl_setopt($ch, CURLOPT_COOKIESESSION, true);

curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 120);
curl_setopt($ch, CURLOPT_TIMEOUT, 120);
curl_setopt($ch, CURLOPT_MAXREDIRS, 10);
curl_setopt($ch, CURLOPT_REFERER, 'https://facebook.com/');
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
$response = curl_exec($ch);
curl_close($ch);
// Escape the body so markup in the response cannot break out of the textarea.
echo '<textarea>' . htmlspecialchars((string) $response) . '</textarea>';
?>

—————————————

http://www.zjmainstay.cn/php-curl

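CURLOPT_COOKIEJAR: file in which the cookies returned by the server are saved
CURLOPT_COOKIE: sends cookies given directly as a string
CURLOPT_COOKIEFILE: sends cookies read from a file

Make sure you use an absolute path for the file name (e.g. /var/dir/cookie.txt), not a relative path.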
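One way to get an absolute path without hard-coding a directory is to build it from the system temp directory. This is a minimal sketch; cookie_file_path is a hypothetical helper, and using the temp directory is an assumption (any writable absolute directory would do).

```php
<?php
// Hypothetical helper: returns an absolute path for the cookie file inside
// the system temp directory (an assumed location, not a requirement).
function cookie_file_path(string $name = 'cookie.txt'): string
{
    $dir = rtrim(sys_get_temp_dir(), DIRECTORY_SEPARATOR);
    return $dir . DIRECTORY_SEPARATOR . $name;
}

$cookie_file = cookie_file_path();
// $cookie_file can then be used for both CURLOPT_COOKIEJAR and CURLOPT_COOKIEFILE.
```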

 

$cookie_file = tempnam('./temp', 'cookie');
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file); // store the cookies received in the response
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file); // send the stored cookies with later requests

// Parse the cookie out of the response header
preg_match("/set\-cookie:([^\r\n]*)/i", $header, $matches);
$cookie = $matches[1];
// It can then be used directly in a later curl request:
// curl_setopt($ch, CURLOPT_COOKIE, $cookie);
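The preg_match call above captures only the first Set-Cookie header, and the captured text still includes attributes such as path. When a response sets several cookies, something like the following sketch may be more convenient. parse_set_cookies is a hypothetical helper and the header block is made-up sample data.

```php
<?php
// Hypothetical helper: collects every Set-Cookie header from a raw header
// block and returns name => value pairs. Attributes such as path or
// expires are dropped.
function parse_set_cookies(string $rawHeaders): array
{
    $cookies = [];
    // Capture everything between "Set-Cookie:" and the first ";" or line end.
    preg_match_all('/^set-cookie:\s*([^;\r\n]+)/mi', $rawHeaders, $matches);
    foreach ($matches[1] as $pair) {
        $parts = explode('=', $pair, 2);
        if (count($parts) === 2) {
            $cookies[trim($parts[0])] = trim($parts[1]);
        }
    }
    return $cookies;
}

// Made-up sample headers for illustration.
$rawHeaders = "HTTP/1.1 200 OK\r\n"
    . "Set-Cookie: a=1; path=/; HttpOnly\r\n"
    . "Set-Cookie: b=2; path=/\r\n\r\n";
$cookies = parse_set_cookies($rawHeaders);

// Join the pairs into the "name=value; name=value" form used by CURLOPT_COOKIE.
$pairs = [];
foreach ($cookies as $name => $value) {
    $pairs[] = $name . '=' . $value;
}
$cookie = implode('; ', $pairs);
```

The raw header block itself can be obtained by setting CURLOPT_HEADER to true, which makes curl_exec() return the headers in front of the body.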

In this case, curl sends the cookie you set manually together with the cookies from the file.
# sending manually set cookie
curl_setopt($ch, CURLOPT_HTTPHEADER, array("Cookie: test=cookie"));

# sending cookies from file
curl_setopt($ch, CURLOPT_COOKIEFILE, $ckfile);

Enable verbose mode: curl_setopt($ch, CURLOPT_VERBOSE, true);

var_dump( curl_setopt($ch, CURLOPT_COOKIESESSION, 1) ); //returns false ????
var_dump( curl_setopt($ch, CURLOPT_COOKIEJAR, '/tmp/cookie') ); //returns false ????
var_dump( curl_setopt($ch, CURLOPT_COOKIEFILE, '/tmp/cookie') ); //returns false ????

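CURLOPT_COOKIESESSION
When enabled, curl treats this as a new cookie session and ignores any session cookies it would otherwise load from a previous session. By default curl stores and loads all cookies, whether or not they are session cookies.
Session cookies are cookies without an expiry date; they are meant to live only for the current session.

CURLOPT_NOPROGRESS
When enabled, turns off the progress meter for curl transfers. This option is enabled by default.
Note:
PHP sets this option to TRUE automatically; it should only be changed for debugging purposes.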

 

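Copyright notice: this is an original article released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.
Article link: https://www.cnblogs.com/yisuo/p/16136189.html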