What causes the "urlopen error [Errno 13] Permission denied" error?
My script tries to download data from LibriVox web pages, for example ... https://librivox.org/selections-from-battle-pieces-and-aspects-of-the-war-by-herman-melville/

It fails with the following error details:

    args = (error(13, 'Permission denied'),)
    errno = None
    filename = None
    message = ''
    reason = error(13, 'Permission denied')
    strerror = None

I can run a command like `wget -O- https://librivox.org/selections-from-battle-pieces-and-aspects-of-the-war-by-herman-melville/` without any error. Here is the code fragment where the error occurs:

    def output_html(url, appname, doobb):
        print "url is %s" % url
        soup = BeautifulSoup(urllib2.urlopen(url).read())

    def output_html(url, appname, doobb):
        #hdr = {'User-Agent':'Mozilla/5.0'}
        #print "url is %s" % url
        #req = urllib2.Request(url, headers=hdr)
        # soup = BeautifulSoup(urllib2.urlopen( url ).read())
        headers = {'User-Agent':'Mozilla/5.0'}
        # headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.99 Safari/537.36'}
        response = requests.get(url, headers=headers)
        soup = BeautifulSoup(response.content)

... when calling ...

    response = requests.get(url, headers=headers)

    ('Connection aborted.', error(13, 'Permission denied'))
    args = (ProtocolError('Connection aborted.', error(13, 'Permission denied')),)
    errno = None
    filename = None
    message = ProtocolError('Connection aborted.', error(13, 'Permission denied'))
    request = response = None
    strerror = None

    def output_html(url):
        soup = BeautifulSoup(urllib2.urlopen(url).read())

    urllib2.HTTPError: HTTP Error 403: Forbidden

2 answers

SELinux is a security feature in CentOS 7 that can block certain actions performed by Python scripts if they are not explicitly allowed. This can result in a "urlopen error [Errno 13] Permission denied" error when using urllib, urllib2, or requests modules in a .py file.
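
Before changing any policy, it can help to confirm that SELinux is actually enforcing on the machine. The sketch below is only an illustration and not part of the original answer: it assumes Python 3 (the question's code is Python 2) and that the standard `getenforce` command is installed; `selinux_mode` is a hypothetical helper name.

```python
import subprocess


def selinux_mode():
    """Return 'Enforcing', 'Permissive' or 'Disabled', or None if unknown."""
    try:
        # getenforce is part of the SELinux command-line tools and prints
        # the current mode as a single word.
        result = subprocess.run(
            ["getenforce"], capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        # The tool is missing or failed, so the mode cannot be determined.
        return None
    return result.stdout.strip()


if __name__ == "__main__":
    mode = selinux_mode()
    if mode == "Enforcing":
        print("SELinux is enforcing; look for denials in the audit log")
    else:
        print("SELinux mode: %s" % mode)
```

If this reports anything other than Enforcing, SELinux is unlikely to be what is denying the connection, and the steps that follow probably do not apply.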
To resolve this, take the following steps:

1. Check /var/log/audit/audit.log for entries related to Python:

    grep python /var/log/audit/audit.log

2. Pipe those entries into audit2allow to generate an SELinux policy module (here named mypol). audit2allow reads the audit entries from standard input:

    grep python /var/log/audit/audit.log | audit2allow -M mypol

3. Install the generated policy module with semodule, so that the blocked Python calls are allowed:

    semodule -i mypol.pp

After these steps, SELinux should no longer block the script's network calls, and the "urlopen error [Errno 13] Permission denied" error should go away. Python's error message gave no hint that SELinux was the cause; the audit log and audit2allow are what exposed and resolved the problem.
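
Because the Python error alone does not point at SELinux, it can be useful to separate a local "Errno 13" denial (the operating system refused to open the socket) from a remote refusal such as the HTTP 403 seen in the question. The following is a minimal sketch under stated assumptions: it uses Python 3's `urllib.request` and `urllib.error` (the successors of `urllib2`), and `diagnose` and `fetch` are hypothetical helper names, not anything from the original posts.

```python
import errno
import urllib.error
import urllib.request


def diagnose(exc):
    """Return a short hint for an exception raised by urlopen."""
    # HTTPError is a subclass of URLError, so it must be checked first.
    if isinstance(exc, urllib.error.HTTPError):
        return "server refused the request (HTTP %d)" % exc.code
    if isinstance(exc, urllib.error.URLError):
        reason = exc.reason
        # EACCES (13) means the local OS denied the socket operation,
        # which is the symptom an SELinux denial produces.
        if isinstance(reason, OSError) and reason.errno == errno.EACCES:
            return "local OS denied the connection (Errno 13); check SELinux"
        return "connection problem: %s" % reason
    return "unexpected error: %r" % exc


def fetch(url):
    """Fetch a URL, printing a hint and returning None on failure."""
    try:
        with urllib.request.urlopen(url) as resp:
            return resp.read()
    except urllib.error.URLError as exc:
        print(diagnose(exc))
        return None
```

With this split, a 403 points at the server (for example, a rejected User-Agent), while Errno 13 points at the local machine, where the audit log is the place to look.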

    from bs4 import BeautifulSoup
    import requests

    headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.99 Safari/537.36'}
    response = requests.get("https://librivox.org/selections-from-battle-pieces-and-aspects-of-the-war-by-herman-melville/", headers=headers)
    soup = BeautifulSoup(response.content)
    print(soup.title.text)  # prints "LibriVox"