What causes the "urlopen error [Errno 13] Permission denied" error?

Question


Anonymous · June 11, 2023


I am trying to write a Python (version 2.7.5) CGI script on a CentOS 7 server.

My script tries to download data from a LibriVox web page, for example https://librivox.org/selections-from-battle-pieces-and-aspects-of-the-war-by-herman-melville/, but it stops with the following error:

:  
      args = (error(13, 'Permission denied'),) 
      errno = None 
      filename = None 
      message = '' 
      reason = error(13, 'Permission denied') 
      strerror = None

I have turned off iptables, and I can run a command like `wget -O- https://librivox.org/selections-from-battle-pieces-and-aspects-of-the-war-by-herman-melville/` without errors. This is the code snippet where the error occurs:

def output_html(url, appname, doobb):
    print "url is %s" % url
    soup = BeautifulSoup(urllib2.urlopen(url).read())

Update: thanks Paul and alecxe, I have updated my code as follows:

def output_html(url, appname, doobb):
    # Earlier urllib2 attempt, left commented out:
    # hdr = {'User-Agent': 'Mozilla/5.0'}
    # print "url is %s" % url
    # req = urllib2.Request(url, headers=hdr)
    # soup = BeautifulSoup(urllib2.urlopen(url).read())
    headers = {'User-Agent': 'Mozilla/5.0'}
    # headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.99 Safari/537.36'}
    response = requests.get(url, headers=headers)
    soup = BeautifulSoup(response.content)

... When calling ...

response = requests.get( url, headers=headers)

... I get a slightly different error ...

: ('Connection aborted.', error(13, 'Permission denied')) 
      args = (ProtocolError('Connection aborted.', error(13, 'Permission denied')),) 
      errno = None 
      filename = None 
      message = ProtocolError('Connection aborted.', error(13, 'Permission denied')) 
      request =  
      response = None 
      strerror = None

... Interestingly, I wrote a command-line version of the script and it runs fine, with code similar to this ...

def output_html(url):
    soup = BeautifulSoup(urllib2.urlopen(url).read())

Very strange, don't you think?
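One way to see why the command-line run and the CGI run might behave differently is to print the identity each one runs under. The sketch below is written for Python 3 (an assumption; the question uses 2.7.5), and the helper name `describe_process_identity` is made up for illustration. It reads `/proc/self/attr/current`, which on a Linux kernel with a security module such as SELinux holds the current process's security context; where that file cannot be read, the value is reported as `None`.

```python
import os
import pwd


def describe_process_identity():
    """Collect the effective user and, if readable, the security context."""
    uid = os.geteuid()
    try:
        user = pwd.getpwuid(uid).pw_name
    except KeyError:
        user = str(uid)  # UID without a passwd entry

    try:
        with open("/proc/self/attr/current", "rb") as handle:
            raw = handle.read()
        # The kernel may terminate the value with a NUL byte or a newline.
        context = raw.rstrip(b"\x00\n").decode("ascii", "replace") or None
    except OSError:
        context = None  # not Linux, or no security module exposes a context

    return {"uid": uid, "user": user, "security_context": context}


if __name__ == "__main__":
    info = describe_process_identity()
    for key in sorted(info):
        print("%s = %s" % (key, info[key]))
```

Running it once from a shell and once as a CGI script, then comparing the two outputs, would show whether the web server starts the script under a different user or a more confined context.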

Update:

This question may already have an answer here:

urllib2.HTTPError: HTTP Error 403: Forbidden (2 answers)

No, those do not answer the question.
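The difference between the linked HTTP 403 question and this one can be checked in code: a 403 is a status returned by the remote server, while Errno 13 (`EACCES`) is raised by the local operating system before any HTTP response exists. The following is a minimal sketch, assuming Python 3 (where `urllib2` was split into `urllib.request` and `urllib.error`); the function name `classify_fetch_error` and its labels are invented for illustration.

```python
import errno
import urllib.error


def classify_fetch_error(exc):
    """Return a short label describing where a fetch failure originated."""
    # HTTPError is a subclass of URLError, so it has to be tested first.
    if isinstance(exc, urllib.error.HTTPError):
        return "http-status-%d" % exc.code
    if isinstance(exc, urllib.error.URLError):
        reason = exc.reason
        if isinstance(reason, OSError) and reason.errno == errno.EACCES:
            # The local OS refused the socket operation; no request was sent.
            return "local-permission-denied"
        return "network-error"
    return "other"
```

An exception shaped like the one in the question, a `URLError` whose `reason` is `error(13, 'Permission denied')`, would fall into the `local-permission-denied` branch, which suggests looking at the local system rather than at request headers.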


3 Answers

Anonymous · Answer 1 · 2023-06-30T15:48:44+00:00

SELinux is a security feature in CentOS 7 that can block actions performed by Python scripts, such as opening outbound network connections, unless its policy explicitly allows them. The result can be a "urlopen error [Errno 13] Permission denied" error when using the urllib, urllib2, or requests modules in a .py file.

To resolve this issue, the following steps can be taken:

1. Check the /var/log/audit/audit.log file for any entries related to Python. This can be done using the following command:

grep python /var/log/audit/audit.log

2. Use the audit2allow tool to generate an SELinux policy module from the entries found in the audit log. audit2allow reads audit messages from standard input, so pipe the grep output into it:

grep python /var/log/audit/audit.log | audit2allow -M mypol

3. Install the generated SELinux policy module using the semodule command. This will allow the blocked Python calls to be executed without permission denied errors. Use the following command:

semodule -i mypol.pp

After these steps, the SELinux restrictions behind the "urlopen error [Errno 13] Permission denied" errors should no longer apply. The error messages from Python did not point to the root cause; it was the grep and audit2allow commands that identified SELinux as the source of the problem.
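Before generating a policy module, it can help to look at what the matching audit entries actually deny. The sketch below (Python 3) pulls a few fields out of AVC denial lines. The sample log line is hypothetical, written in the usual AVC layout and not taken from the asker's system, and the names `find_denials` and `AVC_PATTERN` are illustrative.

```python
import re

# Hypothetical sample in the usual AVC layout; not from the asker's audit.log.
SAMPLE_LOG = (
    'type=AVC msg=audit(1434000000.123:456): avc:  denied  { name_connect } '
    'for  pid=1234 comm="python" dest=443 '
    'scontext=system_u:system_r:httpd_sys_script_t:s0 '
    'tcontext=system_u:object_r:http_port_t:s0 tclass=tcp_socket\n'
    'type=SYSCALL msg=audit(1434000000.123:456): arch=c000003e syscall=42 success=no\n'
)

AVC_PATTERN = re.compile(
    r'avc:\s+denied\s+\{\s*(?P<perms>[^}]*?)\s*\}.*?'
    r'comm="(?P<comm>[^"]*)".*?'
    r'scontext=(?P<scontext>\S+).*?'
    r'tclass=(?P<tclass>\S+)'
)


def find_denials(log_text, command_substring="python"):
    """Return one dict per AVC denial whose command name contains the substring."""
    denials = []
    for line in log_text.splitlines():
        if "type=AVC" not in line:
            continue  # skip SYSCALL and other record types
        match = AVC_PATTERN.search(line)
        if match and command_substring in match.group("comm"):
            denials.append(match.groupdict())
    return denials


if __name__ == "__main__":
    for denial in find_denials(SAMPLE_LOG):
        print(denial)
```

On a real server the text would come from `/var/log/audit/audit.log`, which normally requires root to read. Reviewing the denied permission and object class first makes it easier to judge whether a custom module is needed at all.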


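Anonymous · Answer 2 · 2023-07-07T23:15:38+00:00

This problem is caused by SELinux restrictions. The fix is to change an SELinux boolean so that HTTPD scripts and modules are allowed to connect to the network.

Specifically, run the following command: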

setsebool httpd_can_network_connect on

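This command changes the SELinux boolean so that scripts and modules of the HTTPD service may connect to the network. Set this way, the value lasts until the next reboot; adding the `-P` option to `setsebool` makes it persistent.

More information about this SELinux boolean can be found on the CentOS wiki page:

httpd_can_network_connect (HTTPD Service): allow HTTPD scripts and modules to connect to the network.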
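To confirm the boolean's current state from a script, the output of `getsebool` can be parsed. This is a small sketch in Python 3, assuming `getsebool` prints lines of the form `name --> on` or `name --> off`; the function names are illustrative, and `boolean_is_enabled` only works on a host where the SELinux utilities are installed.

```python
import subprocess


def parse_getsebool_line(line):
    """Parse one line of getsebool output, e.g. 'name --> on'."""
    name, sep, state = line.strip().partition(" --> ")
    if not sep or state not in ("on", "off"):
        raise ValueError("unexpected getsebool output: %r" % line)
    return name, state == "on"


def boolean_is_enabled(name="httpd_can_network_connect"):
    """Run getsebool for one boolean and report whether it is on."""
    output = subprocess.check_output(["getsebool", name])
    return parse_getsebool_line(output.decode("ascii", "replace"))[1]
```

Calling `boolean_is_enabled()` before and after the `setsebool` command would show whether the change took effect.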

Anonymous · Answer 3 · 2023-08-31T00:13:20+00:00

The problem occurs because permission is denied. The fix is to use the `requests` library and supply a `User-Agent` header. Here is a code example:

from bs4 import BeautifulSoup
import requests

# Send a browser-like User-Agent header with the request.
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.99 Safari/537.36'}
response = requests.get("https://librivox.org/selections-from-battle-pieces-and-aspects-of-the-war-by-herman-melville/", headers=headers)
# Name the parser explicitly so BeautifulSoup does not have to guess one.
soup = BeautifulSoup(response.content, "html.parser")
print(soup.title.text)  # the answer reports that this prints "LibriVox"

Thanks for your answer, but this just gives me another version of the Errno 13 error.