How to optimize a script that colorizes log files
The following script colorizes the SQL commands in a log file and also handles a few other tags:
red="\x1b[31m"
green="\x1b[32m"
yellow="\x1b[33m"
blue="\x1b[34m"
white="\x1b[37m"
BLACK="\x1b[30;1m"
RED="\x1b[31;1m"
GREEN="\x1b[32;1m"
YELLOW="\x1b[33;1m"
BLUE="\x1b[34;1m"
CYAN="\x1b[36;1m"
WHITE="\x1b[37;1m"
onred="\x1b[41m"
lblack="\x1b[90m"
lred="\x1b[91m"
lgreen="\x1b[92m"
lyellow="\x1b[93m"
lblue="\x1b[94m"
lmagenta="\x1b[95m"
lcyan="\x1b[96m"
lwhite="\x1b[97m"
reset_color="\x1b[0m"
sed -r "s/'[^']*'/${CYAN}&${reset_color}/g;
s/[a-z_]*_id/${white}&${reset_color}/g;
s/(.*\[)(AbstractApplicationContext)(\].*)/${BLACK}\\1${reset_color}${yellow}\\2${reset_color}${BLACK}\\3${reset_color}/g;
s/(.*\[)(ActionService)(\].*)/${BLACK}\\1${reset_color}${lyellow}\\2${reset_color}${BLACK}\\3${reset_color}/g;
s/(.*\[)(Authenticated)(\].*)/${BLACK}\\1${reset_color}${lyellow}\\2${reset_color}${BLACK}\\3${reset_color}/g;
s/(.*\[)(CascadeHandlerImpl)(\].*)/${BLACK}\\1${reset_color}${lred}\\2${reset_color}${BLACK}\\3${reset_color}/g;
s/(.*\[)(ClasspathHacker)(\].*)/${BLACK}\\1${reset_color}${lred}\\2${reset_color}${BLACK}\\3${reset_color}/g;
s/(.*\[)(ConfigManagerImpl|ConfigManagerLoader)(\].*)/${BLACK}\\1${reset_color}${lred}\\2${reset_color}${BLACK}\\3${reset_color}/g;
s/(.*\[)(ContextFilter|ContextImpl|ContextLoader)(\].*)/${BLACK}\\1${reset_color}${lred}\\2${reset_color}${BLACK}\\3${reset_color}/g;
s/(.*\[)(DataSourceRestrictionConverter)(\].*)/${BLACK}\\1${reset_color}${lgreen}\\2${reset_color}${BLACK}\\3${reset_color}/g;
s/(.*\[)(DatabaseLoader)(\].*)/${BLACK}\\1${reset_color}${lgreen}\\2${reset_color}${BLACK}\\3${reset_color}/g;
s/(.*\[)(DefaultListableBeanFactory)(\].*)/${BLACK}\\1${reset_color}${lgreen}\\2${reset_color}${BLACK}\\3${reset_color}/g;
s/(.*\[)(DispatchFilter)(\].*)/${BLACK}\\1${reset_color}${lgreen}\\2${reset_color}${BLACK}\\3${reset_color}/g;
s/(.*\[)(FileHelper|FileIndex)(\].*)/${BLACK}\\1${reset_color}${lyellow}\\2${reset_color}${BLACK}\\3${reset_color}/g;
s/(.*\[)(LicenseManagerImpl)(\].*)/${BLACK}\\1${reset_color}${lblue}\\2${reset_color}${BLACK}\\3${reset_color}/g;
s/(.*\[)(LocalizedStringsLoader)(\].*)/${BLACK}\\1${reset_color}${lblue}\\2${reset_color}${BLACK}\\3${reset_color}/g;
s/(.*\[)(LoggingPropertyPlaceholderConfigurer)(\].*)/${BLACK}\\1${reset_color}${lblue}\\2${reset_color}${BLACK}\\3${reset_color}/g;
s/(.*\[)(PooledDbDriverImpl)(\].*)/${BLACK}\\1${reset_color}${lcyan}\\2${reset_color}${BLACK}\\3${reset_color}/g;
s/(.*\[)(ProjectLoader)(\].*)/${BLACK}\\1${reset_color}${lcyan}\\2${reset_color}${BLACK}\\3${reset_color}/g;
s/(.*\[)(PropertiesLoaderSupport)(\].*)/${BLACK}\\1${reset_color}${lcyan}\\2${reset_color}${BLACK}\\3${reset_color}/g;
s/(.*\[)(RenderingFilter)(\].*)/${BLACK}\\1${reset_color}${lwhite}\\2${reset_color}${BLACK}\\3${reset_color}/g;
s/(.*\[)(SecurityControllerImpl|SecurityServiceImpl)(\].*)/${BLACK}\\1${reset_color}${lblue}\\2${reset_color}${BLACK}\\3${reset_color}/g;
s/(.*\[)(WorkflowRulesContainerImpl|WorkflowRulesContainerLoader)(\].*)/${BLACK}\\1${reset_color}${lred}\\2${reset_color}${BLACK}\\3${reset_color}/g;
s/(.*\[)(XmlBeanDefinitionReader)(\].*)/${BLACK}\\1${reset_color}${lgreen}\\2${reset_color}${BLACK}\\3${reset_color}/g;
s/(SELECT|select)(.*)((FROM|from) ([^ )]*))([^\]]*)(WHERE|where)?/${yellow}\\1${reset_color}\\2${yellow}\\4${reset_color} ${YELLOW}\\5${reset_color}\\6${yellow}\\7${reset_color}/g;
s/LEFT OUTER JOIN [^ ]* ON/${yellow}&${reset_color}/g;
s/ORDER BY/${yellow}&${reset_color}/g;
s/(ASC|DESC)/${yellow}&${reset_color}/g;
s/GROUP BY/${yellow}&${reset_color}/g;
s/(INSERT INTO) ([^ ]*)(.*)(VALUES)/${green}\\1${reset_color} ${GREEN}\\2${reset_color}\\3${green}\\4${reset_color}/g;
s/(INSERT INTO) ([^ ]*)(.*)/${green}\\1${reset_color} ${GREEN}\\2${reset_color}\\3${reset_color}/g;
s/(UPDATE) *([^ ]*) (SET|set)/${blue}\\1${reset_color} ${BLUE}\\2${reset_color} ${blue}\\3${reset_color}/g;
s/DELETE FROM *[^ ]* WHERE/${RED}&${reset_color}/g;
s/\*\*\*ROLLBACK\*\*\*/${white}${onred}&${reset_color}/g;
s/\[ERROR\]/${WHITE}${onred}&${reset_color}/g;
s/SQLServerException:/${WHITE}${onred}&${reset_color}/g"
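However, when run on a 1.9 MB log file, it takes 1 minute 41 seconds, whereas a plain cat takes less than a second.
If I remove the block of rules from "AbstractApplicationContext" through "XmlBeanDefinitionReader", it runs as fast as cat (under 1 second).
I don't understand why this particular block makes such a huge difference!?
Is there a way to optimize this kind of colorizing script?
Sample file excerpt (copy it repeatedly to make a big file):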
[INFO ][2016-05-20 16:17:51,346][ContextLoader] - [Root WebApplicationContext: initialization started]
[INFO ][2016-05-20 16:17:51,505][XmlBeanDefinitionReader] - [Loading XML bean definitions from ServletContext resource [/WEB-INF/config/context/appContext.xml]]
[INFO ][2016-05-20 16:17:52,986][PropertiesLoaderSupport] - [Loading properties file from class path resource [config/mail.properties]]
[INFO ][2016-05-20 16:17:55,900][ConfigManagerLoader] - [Reading XML config]
[INFO ][2016-05-20 16:17:55,991][ConfigManagerLoader] - [Reading XML config: OK]
[WARN ][2016-05-20 16:17:56,384][ConfigManagerLoader] - [Low max memory=477102080. Java max memory=1000 MB is recommended for production use, as a minimum.]
[INFO ][2016-05-20 16:17:58,309][LocalizedStringsLoader] - [Loading localized strings for locale=[fr_FR]]
[INFO ][2016-05-20 16:17:58,337][LocalizedStringsLoader] - [Loading localized strings for locale=[fr_FR]: OK, strings:759]
[INFO ][2016-05-20 16:17:58,641][LocalizedStringsLoader] - [Loading localized strings for locale=[fr_FR]]
[INFO ][2016-05-20 16:17:58,768][LocalizedStringsLoader] - [Loading localized strings for locale=[fr_FR]: OK, strings:46436]
[INFO ][2016-05-20 16:17:58,830][LocalizedStringsLoader] - [Loading localized strings for locale=[nl_NL]]
[INFO ][2016-05-20 16:17:58,946][LocalizedStringsLoader] - [Loading localized strings for locale=[nl_NL]: OK, strings:46436]
[INFO ][2016-05-20 16:17:59,434][PropertiesLoaderSupport] - [Loading properties file from class path resource [config/mail.properties]]
[INFO ][2016-05-20 16:18:00,476][XmlBeanDefinitionReader] - [Loading XML bean definitions from class path resource [project-child-context.xml]]
[DEBUG][2016-05-20 16:18:01,259][DbConnectionImpl] - [SET CONCAT_NULL_YIELDS_NULL OFF]
[DEBUG][2016-05-20 16:18:01,340][DbConnectionImpl] - [Updated: -1 records]
[INFO ][2016-05-20 16:18:01,363][DatabaseImpl] - [Database loaded: data]
[INFO ][2016-05-20 16:18:01,379][DatabaseLoader] - [Loading Database [data]: OK]
[INFO ][2016-05-20 16:18:01,393][DatabaseLoader] - [Loading Database [schema]]
[DEBUG][2016-05-20 16:18:01,865][DbConnectionImpl] - [SELECT column FROM table WHERE table_name = 'sample']
[DEBUG][2016-05-20 16:18:01,894][DbConnectionImpl] - [SET CONCAT_NULL_YIELDS_NULL OFF]
[DEBUG][2016-05-20 16:18:01,898][DbConnectionImpl] - [Updated: -1 records]
[INFO ][2016-05-20 16:18:06,241][WorkflowRulesContainerLoader] - [Loading Workflow Rule, ruleId=[checkRequestDuplicates]]
[INFO ][2016-05-20 16:18:06,384][WorkflowRulesContainerLoader] - [Loading Workflow Rule, ruleId=[getStatistic]]
[INFO ][2016-05-20 16:18:06,971][WorkflowRulesContainerLoader] - [Loading Workflow Rule, ruleId=[saveRecord]]
[INFO ][2016-05-20 16:18:07,126][WorkflowRulesContainerLoader] - [Loading Workflow Rule, ruleId=[EmailService]]
[INFO ][2016-05-20 16:18:07,542][WorkflowRulesContainerLoader] - [Loading Workflow Rule, ruleId=[LocalizationRead]]
[INFO ][2016-05-20 16:18:09,578][FileIndex$1] - [File index loading started]
[DEBUG][2016-05-20 16:18:19,406][DbConnectionImpl] - [SET CONCAT_NULL_YIELDS_NULL OFF]
[DEBUG][2016-05-20 16:18:19,410][DbConnectionImpl] - [Updated: -1 records]
[INFO ][2016-05-20 16:18:22,201][LicenseManagerImpl] - [Checkout concurrent license=[Main]]
[DEBUG][2016-05-20 16:18:22,209][SecurityControllerImpl] - [Determine next request]
[DEBUG][2016-05-20 16:18:22,239][RenderingFilter] - [Rendering mode]
[DEBUG][2016-05-20 16:18:22,253][Authenticated] - [User is authenticated]
[DEBUG][2016-05-20 16:18:22,306][FileHelper] - [Find file=[hovertip.js]]
[DEBUG][2016-05-20 16:18:22,399][FileHelper] - [Find file=[hovertip.css]]
[DEBUG][2016-05-20 16:22:18,263][DbConnectionImpl] - [INSERT INTO notifications_log (columns, columns) VALUES ('2016-05-20', 'ERROR')]
[DEBUG][2016-05-20 16:22:18,334][DbConnectionImpl] - [Updated: 1 records]
[DEBUG][2016-05-20 16:22:18,393][DbConnectionImpl] - [***COMMIT***]
[DEBUG][2016-05-20 16:22:18,549][DbConnectionImpl] - [***ROLLBACK***]
[DEBUG][2016-05-20 16:23:37,659][DbConnectionImpl] - [SET CONCAT_NULL_YIELDS_NULL OFF]
[DEBUG][2016-05-20 16:23:37,662][DbConnectionImpl] - [Updated: -1 records]
[DEBUG][2016-05-20 16:23:37,886][DataSourceImpl] - [SELECT col_id FROM table_1 LEFT OUTER JOIN table_2 ON table_1.col_id=table_2.col_id]
[DEBUG][2016-05-20 16:23:37,926][DbConnectionImpl] - [***COMMIT***]
[DEBUG][2016-05-20 16:23:37,930][ContextFilter] - [---------- Request: processing finished]
[DEBUG][2016-05-20 16:38:38,033][DbConnectionImpl] - [UPDATE users SET pwd = NULL WHERE user_name = 'me']
[DEBUG][2016-05-20 16:38:38,051][DbConnectionImpl] - [Updated: 1 records]
[DEBUG][2016-05-20 16:38:38,058][DbConnectionImpl] - [SET CONCAT_NULL_YIELDS_NULL OFF]
[DEBUG][2016-05-20 16:38:38,063][DbConnectionImpl] - [Updated: -1 records]
[DEBUG][2016-05-20 17:43:25,087][DbConnectionImpl] - [***COMMIT***]
[DEBUG][2016-05-20 17:43:25,096][ContextFilter] - [---------- Request: processing finished]
How to optimize a script for colorizing log files
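Cause:
- The script contains many near-identical rules, each coloring one class tag, and every input line is tested against each of them in turn, which is inefficient.
Solution:
- Merge the rules that apply the same color into a single rule using alternation, so that fewer regular expressions are tried per line.
Code example:
s/(.*\[)(ActionService)(\].*)/${BLACK}\\1${reset_color}${lyellow}\\2${reset_color}${BLACK}\\3${reset_color}/g;
s/(.*\[)(Authenticated)(\].*)/${BLACK}\\1${reset_color}${lyellow}\\2${reset_color}${BLACK}\\3${reset_color}/g;
and so forth
These can be merged into a single rule:
s/(.*\[)(ActionService|Authenticated)(\].*)/${BLACK}\\1${reset_color}${lyellow}\\2${reset_color}${BLACK}\\3${reset_color}/g;
This reduces the number of regular expressions sed has to try on each line.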
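As a quick sanity check of the merge described above, the following sketch runs the two separate rules and the merged rule on one line from the sample log and compares the results. The placeholder tags such as `<BLACK>` are an assumption made here so the output is readable; the real script uses ANSI escape sequences. `sed -E` is used as the portable spelling of the `-r` flag from the question.

```shell
#!/bin/sh
# Placeholder tags stand in for the ANSI escape sequences (illustrative assumption).
BLACK='<BLACK>'; lyellow='<lyellow>'; reset_color='<reset>'

line='[DEBUG][2016-05-20 16:18:22,253][Authenticated] - [User is authenticated]'

# Two separate rules, as in the original script.
separate=$(printf '%s\n' "$line" | sed -E \
  -e "s/(.*\[)(ActionService)(\].*)/${BLACK}\\1${reset_color}${lyellow}\\2${reset_color}${BLACK}\\3${reset_color}/" \
  -e "s/(.*\[)(Authenticated)(\].*)/${BLACK}\\1${reset_color}${lyellow}\\2${reset_color}${BLACK}\\3${reset_color}/")

# One merged rule using alternation.
merged=$(printf '%s\n' "$line" | sed -E \
  "s/(.*\[)(ActionService|Authenticated)(\].*)/${BLACK}\\1${reset_color}${lyellow}\\2${reset_color}${BLACK}\\3${reset_color}/")

printf '%s\n' "$merged"
```

Both variables should hold the same string, with the class name wrapped in the `<lyellow>` tag and the rest of the line wrapped in `<BLACK>`.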
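How to optimize a script used to colorize log files
Cause of the problem:
The current script compares every input line against 37 regular expressions, which is why it takes so long.
Solution:
Use awk instead, which needs only a single test per line. For example: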
BEGIN {
    red = "<red>" # "\x1b[31m"
    green = "<green>" # "\x1b[32m"
    yellow = "<yellow>" # "\x1b[33m"
    black = "<black>" # "\x1b[30;1m"
    reset = "<reset>" # "\x1b[0m"
    color["ContextLoader"] = red
    color["XmlBeanDefinitionReader"] = green
    color["PropertiesLoaderSupport"] = yellow
}
match($0,/((\[[^]]+\]){2}\[)([^]]+)(.*)/,a) {
    print black a[1] reset color[a[3]] a[3] reset black a[4] reset
}
awk -f tst.awk file
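The code above isolates the interesting part of each log line (the class name) and then prints the line using the color found by a hash table lookup. It uses the third argument to GNU awk's match() function; adapting it to other awks is a simple change.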
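For awks without the three-argument match(), one possible adaptation is sketched below. It relies only on the RSTART and RLENGTH variables that POSIX match() sets, plus substr() and index(). The variable names prefix, rest, name, tail and close_pos are illustrative, and the color values are the same placeholder tags as above. The `{2}` interval is written out twice, on the assumption that some older awks do not support interval expressions.

```shell
#!/bin/sh
line='[INFO ][2016-05-20 16:17:51,346][ContextLoader] - [Root WebApplicationContext: initialization started]'

out=$(printf '%s\n' "$line" | awk '
BEGIN {
    black = "<black>"; reset = "<reset>"
    color["ContextLoader"] = "<red>"
    color["XmlBeanDefinitionReader"] = "<green>"
    color["PropertiesLoaderSupport"] = "<yellow>"
}
# Two bracketed fields followed by the opening bracket of the class name.
match($0, /\[[^]]+\]\[[^]]+\]\[/) {
    prefix = substr($0, RSTART, RLENGTH)     # plays the role of a[1]
    rest = substr($0, RSTART + RLENGTH)      # class name and everything after it
    close_pos = index(rest, "]")             # first closing bracket after the name
    if (close_pos == 0) close_pos = length(rest) + 1
    name = substr(rest, 1, close_pos - 1)    # plays the role of a[3]
    tail = substr(rest, close_pos)           # plays the role of a[4]
    print black prefix reset color[name] name reset black tail reset
}
')
printf '%s\n' "$out"
```

As in the gawk version, a class name with no entry in the color table is printed without a color tag, and lines that do not match are not printed.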
To handle the SQL statements, the code above can be extended by following the logic of the sed script:
s/(SELECT|select)(.*)((FROM|from) ([^ )]*))([^\]]*)(WHERE|where)?/${yellow}\\1${reset_color}\\2${yellow}\\4${reset_color} ${YELLOW}\\5${reset_color}\\6${yellow}\\7${reset_color}/g;
s/LEFT OUTER JOIN [^ ]* ON/${yellow}&${reset_color}/g;
These rules can be added to the existing awk script like this:
....
match($0,/((\[[^]]+\]){2}\[)([^]]+)(.*)/,a) {
    print black a[1] reset color[a[3]] a[3] reset black a[4] reset
    next
}
match($0,/(SELECT|select)(.*)((FROM|from) ([^ )]*))([^\]]*)(WHERE|where)?/,a) {
    print yellow a[1] reset a[2] yellow a[4] reset " " yellow a[5] reset a[6] yellow a[7] reset
    next
}
match($0,/LEFT OUTER JOIN [^ ]* ON/,a) {
    print yellow a[0] reset
    next
}
....
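Note the "next" statement at the end of each block. It tells awk to stop processing the current input line and go back to the start of its implicit "while read line" loop. We use it for efficiency, so that awk does not keep analyzing a line once one regular expression has matched.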
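The effect of "next" can be seen in a small sketch with two illustrative rules and two made-up input lines. The first line matches both rules, but only the first rule fires because "next" skips the rest of the program for that line.

```shell
#!/bin/sh
# Two illustrative input lines: the first has a class tag and SQL, the second only SQL.
out=$(printf '%s\n' \
    '[DEBUG][DbConnectionImpl] - [SELECT col_id FROM table_1]' \
    'SELECT 1' |
awk '
/\[DbConnectionImpl\]/ { print "class rule: " $0; next }   # next skips the remaining rules
/SELECT/               { print "sql rule: " $0 }
')
printf '%s\n' "$out"
```

The consequence is that each input line is handled by at most one block, which is fast but means only one kind of coloring is applied per line.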
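If you are going to do everything in awk, you don't need sed. Use sed for simple substitutions on individual lines and awk for anything more complex; there is no need to use both together. If you had included some SQL statements in your sample input I would have shown you how to handle them, but you can probably work it out yourself (using separate match() { action } blocks). If you need an awk reference, read Effective Awk Programming, 4th Edition, by Arnold Robbins (the other awk books are out of date).
I have added some code for handling the SQL statements at the bottom of my answer. As you can see, it follows quite directly from your existing sed script.
Thanks for improving the example. However, because of your "next" statements, it does not actually work. I need to colorize SQL keywords as well as SQL column names ending in "_id". That cannot be done with a single match per line, because "INSERT INTO" may be followed by "SELECT", a "SELECT" may contain one, two or more "JOIN"s, several "_id" columns can appear anywhere, there may be subqueries, and so on. I have created a sample file containing all the complex cases. I would like the script's run time to stay within 20% of a plain "cat", so that I can use it on live logs piped from "tail -f".
Note that my first attempt was to use awk to colorize whole lines, as you did, and then sed to colorize parts of lines. The two scripts are named "colorlog" and "colorsql". Strangely, when they are chained in a pipeline after "tail -f", there is no output at all! Each of them works fine on its own after "tail -f". So there is some interaction when piping them together that is completely beyond me... Any suggestions?
Then remove the "next" statements from the SQL-related blocks and change "print" to "$0 = "; if that still does not work, post a new question about handling the SQL, with concise, testable sample input and expected output. Are all your log entries and SQL in one file? It looks like you need two different tools to parse two different formats. As for the "tail | awk | sed" problem, it is probably a buffering issue like the one described in stackoverflow.com/questions/5427483/….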
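A minimal sketch of that suggestion, under stated assumptions: instead of printing from each block, the line is rewritten in place (here with gsub(), which modifies $0 directly) so that several keywords and "_id" columns can be colored on the same line, followed by a single print at the end. The keyword list is a reduced, illustrative subset of the sed rules, and the color values are placeholder tags. fflush() after each print is one way to avoid the pipe buffering mentioned above; with GNU sed, the -u (--unbuffered) option serves a similar purpose.

```shell
#!/bin/sh
line='[DEBUG][2016-05-20 16:23:37,886][DataSourceImpl] - [SELECT col_id FROM table_1 LEFT OUTER JOIN table_2 ON table_1.col_id=table_2.col_id]'

out=$(printf '%s\n' "$line" | awk '
BEGIN { yellow = "<yellow>"; white = "<white>"; reset = "<reset>" }
{
    # gsub edits $0 in place; & in the replacement stands for the matched text.
    gsub(/SELECT|FROM|LEFT OUTER JOIN [^ ]* ON/, yellow "&" reset)
    gsub(/[a-z_]*_id/, white "&" reset)
    print       # one print per line, after all substitutions
    fflush()    # flush right away so output appears promptly in a pipeline
}
')
printf '%s\n' "$out"
```

With a live log this would be used along the lines of `tail -f app.log | awk -f color.awk`, where both file names are hypothetical.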
I have posted a new question at stackoverflow.com/questions/37513885. Thanks!