What causes the "urlopen error [Errno 13] Permission denied" error?
I am trying to write a Python (version 2.7.5) CGI script on a CentOS 7 server.
My script tries to download data from a LibriVox page, for example https://librivox.org/selections-from-battle-pieces-and-aspects-of-the-war-by-herman-melville/, but it stops with the following error:
: args = (error(13, 'Permission denied'),) errno = None filename = None message = '' reason = error(13, 'Permission denied') strerror = None
I have turned off iptables, and I can run `wget -O- https://librivox.org/selections-from-battle-pieces-and-aspects-of-the-war-by-herman-melville/` without any error. This is the code fragment where the error occurs:

    def output_html(url, appname, doobb):
        print "url is %s" % url
        soup = BeautifulSoup(urllib2.urlopen(url).read())

Update: Thanks Paul and alecxe, I have updated my code as follows:

    def output_html(url, appname, doobb):
        # hdr = {'User-Agent': 'Mozilla/5.0'}
        # print "url is %s" % url
        # req = urllib2.Request(url, headers=hdr)
        # soup = BeautifulSoup(urllib2.urlopen(url).read())
        headers = {'User-Agent': 'Mozilla/5.0'}
        # headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.99 Safari/537.36'}
        response = requests.get(url, headers=headers)
        soup = BeautifulSoup(response.content)

When it calls

    response = requests.get(url, headers=headers)

I get a slightly different error:
: ('Connection aborted.', error(13, 'Permission denied')) args = (ProtocolError('Connection aborted.', error(13, 'Permission denied')),) errno = None filename = None message = ProtocolError('Connection aborted.', error(13, 'Permission denied')) request = response = None strerror = None
Interestingly, I wrote a command-line version of the script and it works fine, with code similar to this:

    def output_html(url):
        soup = BeautifulSoup(urllib2.urlopen(url).read())

Very strange, don't you think?
Update: This question was flagged as possibly already answered here: "urllib2.HTTPError: HTTP Error 403: Forbidden".
No, the answers there do not address my question.
SELinux is a security feature in CentOS 7 that can block actions performed by Python scripts unless its policy explicitly allows them. When it does, calls made through the urllib, urllib2, or requests modules can fail with "urlopen error [Errno 13] Permission denied".
To resolve this issue, the following steps can be taken:
1. Check the /var/log/audit/audit.log file for any entries related to Python. This can be done using the following command:
grep python /var/log/audit/audit.log
2. Use the audit2allow tool to generate an SELinux policy module from the entries found in the audit log. audit2allow reads the entries from standard input, so pipe the grep output into it:
grep python /var/log/audit/audit.log | audit2allow -M mypol
3. Install the generated SELinux policy module using the semodule command. This will allow the blocked Python calls to be executed without permission denied errors. Use the following command:
semodule -i mypol.pp
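Before generating a policy from the audit log, it can help to look at what was actually denied. The following is a minimal sketch of the filtering done in step 1, written in Python. The function name `find_avc_denials` and the sample log lines are hypothetical illustrations (they are not taken from the asker's server), and the regular expression assumes the usual `avc: denied { ... }` record layout.

```python
import re

# Fields of interest in an SELinux AVC denial record (assumed layout).
AVC_PATTERN = re.compile(
    r'avc:\s+denied\s+\{\s*(?P<perms>[^}]*?)\s*\}.*?'
    r'comm="(?P<comm>[^"]*)".*?'
    r'scontext=(?P<scontext>\S+)\s+'
    r'tcontext=(?P<tcontext>\S+)\s+'
    r'tclass=(?P<tclass>\S+)'
)


def find_avc_denials(lines, keyword="python"):
    """Return parsed AVC denials from audit log lines mentioning keyword."""
    denials = []
    for line in lines:
        if keyword not in line:
            continue  # same effect as `grep python`
        match = AVC_PATTERN.search(line)
        if match:
            denials.append(match.groupdict())
    return denials


# Illustrative sample only; real entries will differ.
SAMPLE_LOG = [
    'type=AVC msg=audit(1415714880.156:29): avc:  denied  { name_connect } '
    'for  pid=1349 comm="python" dest=443 '
    'scontext=system_u:system_r:httpd_sys_script_t:s0 '
    'tcontext=system_u:object_r:http_port_t:s0 tclass=tcp_socket',
    'type=SYSCALL msg=audit(1415714880.156:29): arch=c000003e syscall=42 '
    'success=no exit=-13 comm="python"',
]

if __name__ == "__main__":
    for denial in find_avc_denials(SAMPLE_LOG):
        print(denial)
```

To run this against the real log, pass an open file object for /var/log/audit/audit.log (which normally requires root) instead of `SAMPLE_LOG`. Reviewing the denied permissions first makes it easier to judge whether the module produced by audit2allow grants more than intended.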
After these steps, the SELinux restrictions behind the "urlopen error [Errno 13] Permission denied" errors should no longer apply. Python's error messages gave no hint that SELinux was the root cause; the grep and audit2allow commands are what diagnosed and resolved the problem.
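Because the Python traceback does not name SELinux, one way to narrow things down is to check whether the exception wraps an operating-system errno 13 (EACCES) rather than an HTTP status such as 403. The sketch below assumes the exception shapes shown in the question (a `URLError` whose `reason` is a socket error, or a requests error whose `args` nest one); the helper name `is_os_permission_denied` is hypothetical.

```python
import errno

try:
    from urllib.error import URLError  # Python 3
except ImportError:
    from urllib2 import URLError  # Python 2, as used in the question


def is_os_permission_denied(exc, _depth=0):
    """Return True if exc, its reason, or its nested args carry errno 13."""
    if _depth > 5:  # guard against pathological nesting
        return False
    if getattr(exc, "errno", None) == errno.EACCES:
        return True
    # urllib stores the socket error in .reason; requests nests it in .args.
    candidates = [getattr(exc, "reason", None)]
    candidates.extend(getattr(exc, "args", ()))
    return any(
        isinstance(item, BaseException)
        and is_os_permission_denied(item, _depth + 1)
        for item in candidates
    )


if __name__ == "__main__":
    denied = URLError(OSError(errno.EACCES, "Permission denied"))
    print(is_os_permission_denied(denied))
    print(is_os_permission_denied(URLError("Forbidden")))
```

If the helper returns True, the failure most likely happened when the socket was opened, before any HTTP request was sent. That would explain why changing request headers has no effect and points toward an OS-level restriction such as SELinux.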
Thank you to the person who shared this solution, as it helped others who were experiencing similar issues.
The cause of this problem is that permission is denied; the fix is to use the `requests` library and supply a `User-Agent` header. Here is a code example of the fix:

    from bs4 import BeautifulSoup
    import requests

    headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.99 Safari/537.36'}
    response = requests.get("https://librivox.org/selections-from-battle-pieces-and-aspects-of-the-war-by-herman-melville/", headers=headers)
    soup = BeautifulSoup(response.content)
    print(soup.title.text)  # prints "LibriVox"

Thanks for your answer, but this just gives me another version of the error 13.