Handy helpers for searching logs: find and grep

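When we need to read logs, most of the time it is to analyze a bug 😳 The first goal is to quickly locate the file, and the position inside it, where the problem lies. Because Linux, unlike Windows, is usually used without a polished graphical interface, we rely mainly on commands to search through logs.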

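Today's notes are mainly about the find and grep commands. The reference article listed at the end of this post is a good way to learn both.

After reading it, you should be reasonably familiar with these two commands.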

A few find options deserve special attention:

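atime: access time, the time the file was last read. The touch command can be used to set it to the current time.

ctime: change time, the time the file's metadata (its inode) was last changed, for example by chmod, chgrp, or mv. Modifying the file's content updates ctime as well.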

mtime: modify time, the time the file's content was last modified, for example by an echo redirection or by editing in vi.
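To see these timestamps move independently, here is a minimal sketch. It assumes GNU coreutils stat (the -c format option does not exist in BSD/macOS stat), and the file name demo.log is made up for illustration.

```shell
# Assumption: GNU stat, where %Y is mtime and %Z is ctime (seconds since the epoch).
workdir=$(mktemp -d)
f="$workdir/demo.log"
echo "first line" > "$f"
mtime_before=$(stat -c %Y "$f")
ctime_before=$(stat -c %Z "$f")

sleep 1
chmod u+x "$f"                      # metadata change only: ctime moves, mtime stays
mtime_after_chmod=$(stat -c %Y "$f")
ctime_after_chmod=$(stat -c %Z "$f")

sleep 1
echo "second line" >> "$f"          # content change: mtime moves (and ctime with it)
mtime_after_write=$(stat -c %Y "$f")

echo "mtime: $mtime_before -> $mtime_after_chmod -> $mtime_after_write"
echo "ctime: $ctime_before -> $ctime_after_chmod"
rm -rf "$workdir"
```

The chmod step is the interesting one: only ctime changes, which is why a file can show up under -ctime while -mtime still considers it old.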

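-mtime -n / -mtime +n: search by modification time. -n matches files modified less than n days ago, +n matches files modified more than n days ago. find counts the age in whole 24-hour periods and discards the fraction, so +n in practice means at least n+1 full days ago.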

-type: search for files of a given type, for example f for regular files.
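A small sketch of -mtime and -type working together. It assumes GNU touch, whose -d option accepts relative dates such as '5 days ago'; the file names are hypothetical.

```shell
# Build a throwaway directory with one fresh file, one backdated file, and a subdirectory.
workdir=$(mktemp -d)
touch "$workdir/fresh.log"
touch -d '5 days ago' "$workdir/stale.log"   # assumption: GNU touch relative dates
mkdir "$workdir/subdir"

# Regular files modified less than 2 days ago.
recent=$(find "$workdir" -type f -mtime -2)
# Regular files modified more than 2 days ago.
old=$(find "$workdir" -type f -mtime +2)

echo "recent: $recent"
echo "old:    $old"
rm -rf "$workdir"
```

Because of -type f, the subdirectory never appears in either result even though it was just created.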

A few grep options and arguments deserve special attention:

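*: a shell wildcard that expands to all (non-hidden) files in the current directory; a specific file name can be given instead
-r: search recursively
-n: show line numbers
-R: search all files including subdirectories; in GNU grep this behaves like -r but also follows symbolic links
-i: ignore case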
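The options above combine freely. The following sketch uses made-up log files (app.log and sub/worker.log are assumptions for illustration) to show how -i changes what a recursive search finds.

```shell
# Two hypothetical log files, one of them in a subdirectory.
workdir=$(mktemp -d)
mkdir "$workdir/sub"
printf 'first line\nERROR: disk full\n' > "$workdir/app.log"
printf 'error: timeout\nall good\n' > "$workdir/sub/worker.log"

# Recursive, with line numbers, case-sensitive: only the lowercase "error" matches.
hits=$(grep -rn "error" "$workdir")

# Same search ignoring case: both files match. sort makes the order predictable.
hits_i=$(grep -rni "error" "$workdir" | sort)

printf '%s\n' "$hits_i"
rm -rf "$workdir"
```

Each result line has the form path:line-number:matched-text, which is the same layout as the Tornado example output in this post.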

A practical example: we copy the source of the web module from the Tornado source code into a file and then search in it.

 /tmp vim python-doc.txt                                                                  
 /tmp more python-doc.txt                                                                 
#
# Copyright 2009 Facebook
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.

"""``tornado.web`` provides a simple web framework with asynchronous
features that allow it to scale to large numbers of open connections,
making it ideal for `long polling

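We look in the current directory for regular files modified within the last two days that contain the string tornado, and show the line number of each match.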

 /tmp  find ./  -type f -mtime -2 | xargs grep -n "tornado" 
 
.//python-doc.txt:16:"""``tornado.web`` provides a simple web framework with asynchronous
.//python-doc.txt:25:    import tornado.ioloop
.//python-doc.txt:26:    import tornado.web
.//python-doc.txt:28:    class MainHandler(tornado.web.RequestHandler):
.//python-doc.txt:33:        application = tornado.web.Application([
.//python-doc.txt:37:        tornado.ioloop.IOLoop.current().start()
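One caveat with piping find into xargs: file names containing spaces are split into several arguments. A possible variant, sketched here under the assumption that grep supports -H (GNU and BSD grep do), lets find pass the names to grep directly with -exec ... {} +. The file names are invented for the demonstration.

```shell
# Hypothetical files; one name contains a space on purpose.
workdir=$(mktemp -d)
printf 'import tornado.web\n' > "$workdir/my notes.txt"
printf 'nothing here\n' > "$workdir/other.txt"

# -exec ... {} + hands the file names to grep without any word splitting.
# -H forces the file name prefix even if only one file is passed.
matches=$(find "$workdir" -type f -mtime -2 -exec grep -Hn "tornado" {} +)

printf '%s\n' "$matches"
rm -rf "$workdir"
```

With GNU tools, find ... -print0 | xargs -0 grep -n "tornado" is another common way to get the same safety while keeping the pipeline shape.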

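Using awk to print the columns you want or to sum them

awk's default field separator is whitespace (spaces and tabs); the -F option specifies a different separator.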

awk -F "." '{print $1}'
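As a quick check of -F, this sketch splits a file name on the dot. The input is simply the sample file name used earlier in this post, fed in through printf.

```shell
# Split on "." and keep the first field (the part before the extension).
base=$(printf '%s\n' "python-doc.txt" | awk -F "." '{print $1}')
echo "$base"
```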

Using awk to sum a column

awk '{sum += $1} END {print sum}' test.txt
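Summing and -F can be combined. The sketch below uses invented colon-separated data (a name and a number per line) and adds up the second column.

```shell
# Hypothetical input: name:value pairs, one per line.
total=$(printf 'a:10\nb:20\nc:12\n' | awk -F ":" '{sum += $2} END {print sum}')
echo "$total"
```

The END block runs once after the last input line, which is why the sum is printed only a single time.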

Using awk to remove lines that repeat in a given column

[root@localhost cc]# cat 2.txt
adc 3 5
a d a
a 3 adf
a d b
a 3 adf

Remove lines whose first column repeats:

[root@localhost cc]# cat 2.txt |awk '!a[$1]++{print}'
adc 3 5
a d a

For lines that share the same key, the first (topmost) one is kept.
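The pattern !a[$1]++ is compact, so a spelled-out equivalent may help. This is an assumed rewrite for illustration; the array name seen is arbitrary, and the sample data is repeated inline so the snippet runs on its own.

```shell
# Same content as 2.txt, kept inline to make the snippet self-contained.
data='adc 3 5
a d a
a 3 adf
a d b
a 3 adf'

# Short form: a[$1]++ evaluates to 0 (false) the first time a key appears,
# so !a[$1]++ is true only for the first line with that first column.
short=$(printf '%s\n' "$data" | awk '!a[$1]++')

# Spelled-out form: print a line only if its key has not been seen yet.
long=$(printf '%s\n' "$data" | awk '{ if (!($1 in seen)) { seen[$1] = 1; print } }')

printf '%s\n' "$long"
```

Both forms keep one line per distinct first column, in the order the keys first appear.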

Remove lines whose first and second columns both repeat:

[root@localhost cc]# cat 2.txt |awk '!a[$1" "$2]++{print}'
adc 3 5
a d a
a 3 adf

Remove lines that are exact duplicates:

[root@localhost cc]# cat 2.txt |awk '!a[$0]++{print}'
adc 3 5
a d a
a 3 adf
a d b

Show only the duplicated lines:

[root@localhost cc]# cat 2.txt |awk 'a[$0]++{print}'
a 3 adf

Reference article:

Linux系统下查找最近修改过的文件 (Finding recently modified files on Linux)