· net f

F#: A day of writing a little twitter application

I spent most of the bank holiday Monday here in Sydney writing a little application to scan through my twitter feed and find me just the tweets which have links in them since for me that’s where a lot of the value of twitter lies.

I’m sure someone has done this already but it seemed like a good opportunity to try and put a little of the F# that I’ve learned from reading Real World Functional Programming to use. The code I’ve written so far is at the end of this post.

What did I learn?

  • I didn’t really want to write a wrapper on top of the twitter API so I put out a request for suggestions for a .NET twitter API. It pretty much seemed to be a choice of either Yedda or tweetsharp and since the latter seemed easier to use I went with that. In the code you see at the end I have added the 'Before' method to the API because I needed it for what I wanted to do.

  • I found it really difficult writing the 'findLinks' method - the way I’ve written it at the moment uses pattern matching and recursion which isn’t something I’ve spent a lot of time doing. Whenever I tried to think how to solve the problem my mind just wouldn’t move away from the procedural approach of going down the collection, setting a flag depending on whether we had a 'lastId' or not and so on. Eventually I explained the problem to Alex and working together through it we realised that there are three paths that the code can take:

    1. When we have processed all the tweets and want to exit

    2. The first call to get tweets when we don’t have a 'lastId' starting point - I was able to get 20 tweets at a time through the API

    3. Subsequent calls to get tweets when we have a 'lastId' from which we want to work backwards from

    I think it is probably possible to reduce the code in this function to follow just one path by passing in the function to find the tweets but I haven’t been able to get this working yet.

  • I recently watched a F# video from Alt.NET Seattle featuring Amanda Laucher where she spoke of the need to explicitly state types that we import from C# into our F# code. You can see that I needed to do that in my code when referencing the TwitterStatus class - I guess it would be pretty difficult for the use of that class to be inferred but it still made the code a bit more clunky than any of the other simple problems I’ve played with before.

  • I’ve not used any of the functions on 'Seq' until today - from what I understand these are available for applying operations to any collections which implement IEnumerable - which is exactly what I had!

  • I had to use the following code to allow F# interactive to recognise the Dimebrain namespace: ~text #r "\path\to\Dimebrain.Tweetsharp.dll" ~ I thought it would be enough to reference it in my Visual Studio project and reference the namespace but apparently not. </ul>

    == The code

    This is the code I have at the moment - there are certainly some areas that it can be improved but I’m not exactly sure how to do it. In particular: What’s the best way to structure F# code? I haven’t seen any resources on how to do this so it’d be cool if someone could point me in the right direction. The code I’ve written is just a collection of functions which doesn’t really have any structure at all. Reducing duplication - I hate the fact I’ve basically got the same code twice in the 'getStatusesBefore' and 'getLatestStatuses' functions - I wasn’t sure of the best way to refactor that. Maybe putting the common code up to the 'OnFriendsTimeline' call into a common function and then call that from the other two functions? I think a similar approach can be applied to findLinks as well. The code doesn’t feel that expressive to me - I was debating whether or not I should have passed a type into the 'findLinks' function - right now it’s only possible to tell what each part of the tuple means by reading the pattern matching code which feels wrong. I think there may also be some opportunities to use the function composition operator but I couldn’t quite see where. How much context should we put in the names of functions? Most of my programming has been in OO languages where whenever we have a method its context is defined by the object on which it resides. When naming functions such as 'findOldestStatus' and 'oldestStatusId' I wasn’t sure whether or not I was putting too much context into the function name. I took the alternative approach with the 'withLinks' function since I think it reads more clearly like that when it’s actually used.

    // Import required namespaces
    open Dimebrain.TweetSharp.Fluent
    open Dimebrain.TweetSharp.Extensions
    open Dimebrain.TweetSharp.Model
    open Microsoft.FSharp.Core.Operators
    
    // Define a function to get statuses before a given status ID
    let getStatusesBefore (statusId: int64) =
        FluentTwitter.CreateRequest()
            .AuthenticateAs("userName", "password")
            .Statuses()
            .OnFriendsTimeline()
            .Before(statusId)
            .AsJson()
            .Request()
            .AsStatuses()
    
    // Define a function to filter statuses containing links
    let withLinks (statuses: seq<TwitterStatus>) =
        statuses |> Seq.filter (fun eachStatus -> eachStatus.Text.Contains("http"))
    
    // Define a function to print statuses
    let print (statuses: seq<TwitterStatus>) =
        for status in statuses do
            printfn "[%s] %s" status.User.ScreenName status.Text
    
    // Define a function to get the latest statuses
    let getLatestStatuses =
        FluentTwitter.CreateRequest()
            .AuthenticateAs("userName", "password")
            .Statuses()
            .OnFriendsTimeline()
            .AsJson()
            .Request()
            .AsStatuses()
    
    // Define a function to find the oldest status
    let findOldestStatus (statuses: seq<TwitterStatus>) =
        statuses
        |> Seq.sort_by (fun eachStatus -> eachStatus.Id)
        |> Seq.head
    
    // Retrieve the oldest status ID from the latest statuses
    let oldestStatusId = (getLatestStatuses |> findOldestStatus).Id
    
    // Define a recursive function to find statuses with links
    let rec findLinks (args: int64 * int * int) =
        match args with
        | (_, numberProcessed, recordsToSearch) when numberProcessed >= recordsToSearch -> ignore
        | (0L, numberProcessed, recordsToSearch) ->
            let latestStatuses = getLatestStatuses
            (latestStatuses |> withLinks) |> print
            findLinks(findOldestStatus(latestStatuses).Id, numberProcessed + 20, recordsToSearch)
        | (lastId, numberProcessed, recordsToSearch) ->
            let latestStatuses = getStatusesBefore lastId
            (latestStatuses |> withLinks) |> print
            findLinks(findOldestStatus(latestStatuses).Id, numberProcessed + 20, recordsToSearch)
    
    // Define a function to initiate the search for statuses with links
    let findStatusesWithLinks recordsToSearch =
        findLinks(0L, 0, recordsToSearch) |> ignore

And to use it to find the links contained in the most recent 100 statuses of the people I follow:

findStatusesWithLinks 100;;

Any advice on how to improve this will be gratefully received. I’m going to continue working this into a little DSL which can print me up a nice summary of the links that have been posted during the times that I’m not on twitter watching what’s going on.

  • LinkedIn
  • Tumblr
  • Reddit
  • Google+
  • Pinterest
  • Pocket