Best way to find out whether a string exists in a file
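If you are on Java 8 and the file is large, a good way to check whether a string occurs in the file is the Streams API. There are two cases: either you want to return the first line that contains stringToSearch, or you want to go through every line that contains stringToSearch. Sample code: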
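    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.Optional;
    import java.util.stream.Stream;

    String fileName = "c://SomeFile.txt";
    String stringToSearch = "dummy";

    // Case 1: find the first matching line (reading stops at the first match)
    try (Stream<String> stream = Files.lines(Paths.get(fileName))) {
        Optional<String> lineHavingTarget =
            stream.filter(l -> l.contains(stringToSearch)).findFirst();
        // do something with lineHavingTarget
    } catch (IOException e) {
        // log the exception
    }

    // Case 2: visit every matching line. A stream can be consumed only once,
    // so open a fresh one instead of reusing the stream from case 1.
    try (Stream<String> stream = Files.lines(Paths.get(fileName))) {
        stream.filter(l -> l.contains(stringToSearch)).forEach(System.out::println);
    } catch (IOException e) {
        // log the exception
    }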
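So reading all of the file's lines into memory does not seem like a good idea; it is better to read the file line by line. If you are interested in the fastest string-search algorithms, see this link.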
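When a file contains a large number of lines, it is better to read it line by line than to load its whole content into memory. In short: read a line, check whether the string is present, then move on to the next line.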
To implement this, we can use the `open()` function in Python to open the file and read it line by line. We can then use the `in` operator to check if the desired string is present in each line.
    def is_string_present(file_path, search_string):
        with open(file_path, 'r') as file:
            for line in file:
                if search_string in line:
                    return True
        return False
This function takes the file path and the search string as input parameters. It opens the file using the `open()` function in read mode ('r'). Then, it iterates over each line in the file and checks if the search string is present in that line using the `in` operator. If it finds a match, it returns True. If it reaches the end of the file without finding a match, it returns False.
This approach is more memory-efficient as it reads the file line by line instead of loading the entire content into memory. It is also time-efficient as it stops as soon as it finds a match, instead of checking the entire file.
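The same line-by-line idea can also report which line matched, mirroring the two Java cases above (first match, or every match). The following is a minimal sketch, not a definitive implementation: the function names `first_matching_line` and `iter_matching_lines` are hypothetical, and UTF-8 is assumed as the file encoding.

```python
import os
import tempfile
from typing import Iterator, Optional, Tuple


def first_matching_line(file_path: str, search_string: str) -> Optional[Tuple[int, str]]:
    """Return (line_number, line) for the first line containing search_string, or None."""
    # Assumption: the file is UTF-8 encoded; adjust the encoding as needed.
    with open(file_path, "r", encoding="utf-8") as file:
        for line_number, line in enumerate(file, start=1):
            if search_string in line:
                # Stop reading as soon as a match is found.
                return line_number, line.rstrip("\n")
    return None


def iter_matching_lines(file_path: str, search_string: str) -> Iterator[Tuple[int, str]]:
    """Lazily yield (line_number, line) for every line containing search_string."""
    with open(file_path, "r", encoding="utf-8") as file:
        for line_number, line in enumerate(file, start=1):
            if search_string in line:
                yield line_number, line.rstrip("\n")


if __name__ == "__main__":
    # Small throwaway file so the sketch can be run as-is.
    with tempfile.NamedTemporaryFile(
        "w", suffix=".txt", delete=False, encoding="utf-8"
    ) as tmp:
        tmp.write("alpha\nfoo dummy bar\nbeta\ndummy again\n")
        sample_path = tmp.name
    try:
        print(first_matching_line(sample_path, "dummy"))
        print(list(iter_matching_lines(sample_path, "dummy")))
        print(first_matching_line(sample_path, "missing"))
    finally:
        os.remove(sample_path)
```

Both functions hold only one line in memory at a time. Because each line is tested separately, a search string that spans a line break is not found by either of them.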
In conclusion, when dealing with large files, read the file line by line and check each line for the string with the `in` operator. This keeps memory usage low and lets the search stop at the first match.
Question: What is the best way to find a string in a file?
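When processing a file, there is little benefit in storing every line of it in a list. Both of the approaches you proposed share that problem, though.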
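If you only care about specific lines in the file, you probably do not want to keep the lines you do not need in memory. If you are using Java 8, you can use Files.lines() to read the file line by line as a stream. Otherwise, Guava's LineProcessor can do the same.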
The example below uses a stream to find all lines that contain a string and stores them in a list.
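    List<String> lines;
    // try-with-resources closes the underlying file once the stream is consumed
    try (Stream<String> stream = Files.lines(path)) {
        lines = stream
            // findFirst() could be used instead to get the first match and stop
            .filter(line -> line.contains("foo"))
            .collect(Collectors.toList());
    }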
The example below does the same with Guava.
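    import com.google.common.io.Files;
    import com.google.common.io.LineProcessor;
    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.util.ArrayList;
    import java.util.List;

    // Guava's Files.readLines needs a charset argument; UTF-8 is an assumption here.
    List<String> lines = Files.readLines(file, StandardCharsets.UTF_8,
        new LineProcessor<List<String>>() {
            private final List<String> lines = new ArrayList<>();

            @Override
            public boolean processLine(String line) throws IOException {
                if (line.contains("foo")) {
                    lines.add(line);
                }
                return true; // return false to stop
            }

            @Override
            public List<String> getResult() {
                return lines;
            }
        });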