Architecture: types

Edsko de Vries edited this page Aug 22, 2013 · 4 revisions

May be out of date

HackageModule and HackageFeature

All that the server needs to know about a feature is in the HackageModule type:

data HackageModule = HackageModule {
    featureName   :: String,
    resources     :: [Resource],
    dumpBackup    :: Maybe (BlobStorage -> IO [BackupEntry]),
    restoreBackup :: Maybe (BlobStorage -> RestoreBackup)
}

The name should be an alphabetical string, preferably just one word, that doesn't conflict with other features' names.

The [Resource] field is the most important one: it defines which pages the feature will serve. If you make a Resource at /animal/:type/:name, then visiting http://website.com/animal/monkey/alexander will ask that Resource what the best response is.
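
To make the path-template idea concrete, here is a minimal sketch of how a template such as /animal/:type/:name could be matched against a request path. It is not the matching code used by the server's Resource type; Segment, parseTemplate and matchPath are hypothetical names invented for this illustration.

```haskell
-- Hypothetical sketch: none of these names come from hackage-server.

-- | One piece of a route template: fixed text or a named capture.
data Segment = Static String | Dynamic String
  deriving (Show, Eq)

-- | Split a path on '/', dropping empty pieces.
splitPath :: String -> [String]
splitPath s = case dropWhile (== '/') s of
  "" -> []
  s' -> let (piece, rest) = break (== '/') s'
        in piece : splitPath rest

-- | Turn "/animal/:type/:name" into a list of segments.
parseTemplate :: String -> [Segment]
parseTemplate = map toSegment . splitPath
  where
    toSegment (':' : name) = Dynamic name
    toSegment text         = Static text

-- | Match a request path against a template, returning the captured
-- (name, value) pairs on success.
matchPath :: [Segment] -> String -> Maybe [(String, String)]
matchPath template path = go template (splitPath path)
  where
    go [] [] = Just []
    go (Static t : ts) (p : ps)
      | t == p = go ts ps
    go (Dynamic n : ts) (p : ps) = ((n, p) :) <$> go ts ps
    go _ _ = Nothing

main :: IO ()
main = print (matchPath (parseTemplate "/animal/:type/:name")
                        "/animal/monkey/alexander")
```

Under these assumptions, the captured pairs are what a Resource would inspect when deciding on a response.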

The two backup fields, both optional, export and import a human-readable representation of a feature's data. They are not used for persistent state (happstack-state handles that), only for periodic snapshots.

Features will probably want to provide information beyond what's available in the HackageModule record. For that reason, the actual feature object can be any data structure whatsoever, so long as you can get a HackageModule from it. In other words, it has to implement the HackageFeature typeclass:

class HackageFeature a where
    getFeature :: a -> HackageModule
    initHooks  :: a -> [IO ()]

(The initHooks are a list of miscellaneous actions to run at startup. This includes running hooks, initializing caches, forking maintenance threads, and so on. The default implementation of this is const [].)

Each feature provides an initialization function for its own feature object. This function usually just takes the global server config and possibly other feature objects as its arguments. It's up to the top-level server code to pass the initialization function the required parameters. Once all of the feature objects have been initialized, all of their initHooks are called. Then, getFeature is called on all of the feature objects, yielding a list of HackageModules. Depending on the startup mode, this list can be used to import a package tarball, export one, or start serving web pages.
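
The startup sequence can be sketched roughly as follows. This is an illustration only: Config and HackageModule are cut down to a single field each, and CoreFeature, TagsFeature, initCoreFeature, initTagsFeature, register and startFeatures are hypothetical names, not the server's actual code.

```haskell
-- Simplified stand-ins for the real types (assumptions for this sketch).
data Config = Config { serverStaticDir :: FilePath }
data HackageModule = HackageModule { featureName :: String }

class HackageFeature a where
    getFeature :: a -> HackageModule
    initHooks  :: a -> [IO ()]
    initHooks = const []   -- default: nothing to do at startup

-- A hypothetical feature object carrying extra data of its own.
data CoreFeature = CoreFeature { coreStaticDir :: FilePath }

instance HackageFeature CoreFeature where
    getFeature _ = HackageModule { featureName = "core" }
    initHooks f  = [putStrLn ("core: static files in " ++ coreStaticDir f)]

-- A second hypothetical feature that depends on the first one.
data TagsFeature = TagsFeature { tagsCore :: CoreFeature }

instance HackageFeature TagsFeature where
    getFeature _ = HackageModule { featureName = "tags" }

-- Initialization functions take the config and any features they need.
initCoreFeature :: Config -> IO CoreFeature
initCoreFeature config = return (CoreFeature (serverStaticDir config))

initTagsFeature :: Config -> CoreFeature -> IO TagsFeature
initTagsFeature _ core = return (TagsFeature core)

-- Reduce any feature object to the two things the server needs from it.
register :: HackageFeature a => a -> ([IO ()], HackageModule)
register feature = (initHooks feature, getFeature feature)

-- Initialize every feature, run all hooks, then collect the modules.
startFeatures :: Config -> IO [HackageModule]
startFeatures config = do
    core <- initCoreFeature config
    tags <- initTagsFeature config core
    let registered = [register core, register tags]
    sequence_ (concatMap fst registered)
    return (map snd registered)

main :: IO ()
main = do
    modules <- startFeatures (Config "static")
    mapM_ (putStrLn . featureName) modules
```

The register helper is one way to put feature objects of different types into a single list; the real server may organize this differently.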

Server

data Server = Server {
  serverTxControl :: MVar TxControl,
  serverFeatures  :: [HackageModule],
  serverPort      :: Int,
  serverConfig    :: Config
}

This is the top-level server type, created in preparation for calling simpleHTTP. The happstack-state control is created before the features are initialized. If transaction controls can someday be split up by handle instead of requiring a global IORef, this could be split off into features which 'own' each data component. The [HackageModule] field is created from hackageFeatures :: Config -> IO [HackageModule] in Distribution.Server.Features. The remaining fields come from config options on the command line.

Config

data Config = Config {
    serverStore     :: BlobStorage,
    serverStaticDir :: FilePath,
    serverURI       :: URI.URIAuth
}

These fields come straight from command-line options. All features are passed this structure when they are initialized.

PackageIndex

PackageIndex operations are defined in Distribution.Server.PackageIndex. It should be imported qualified.

The central PackageIndex PkgInfo, from PackagesState in Distribution.Server.Packages.State, is essentially a Map PackageName [PkgInfo] which obeys certain invariants (no duplicate versions, no empty lists, etc.). This type comes straight from the cabal-install codebase. Presently the only querying function is GetPackagesState; the modification functions are InsertPkgIfAbsent and MergePkg. Call the modification functions through their wrappers in the core feature so that the package index-syncing hooks are triggered.
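
As a rough illustration of the invariants, the sketch below models the index as a plain Map and implements a pure insert-if-absent operation. It is not the code in Distribution.Server.PackageIndex: PkgInfo is cut down to a name and version, and insertIfAbsent and invariant are hypothetical stand-ins written for this example.

```haskell
import qualified Data.Map as Map
import Data.List (nub)

-- Simplified stand-ins (assumptions for this sketch).
type PackageName = String
type Version     = [Int]

data PkgInfo = PkgInfo { pkgName :: PackageName, pkgVersion :: Version }
  deriving (Show, Eq)

newtype PackageIndex = PackageIndex (Map.Map PackageName [PkgInfo])
  deriving (Show, Eq)

emptyIndex :: PackageIndex
emptyIndex = PackageIndex Map.empty

-- | Insert a package unless that name and version is already present.
-- Returns Nothing when the version already exists.
insertIfAbsent :: PkgInfo -> PackageIndex -> Maybe PackageIndex
insertIfAbsent pkg (PackageIndex m)
  | any ((== pkgVersion pkg) . pkgVersion) existing = Nothing
  | otherwise =
      Just (PackageIndex (Map.insert (pkgName pkg) (pkg : existing) m))
  where
    existing = Map.findWithDefault [] (pkgName pkg) m

-- | Check the two invariants named in the text: no empty lists and
-- no duplicate versions under one package name.
invariant :: PackageIndex -> Bool
invariant (PackageIndex m) = all ok (Map.elems m)
  where
    ok pkgs = not (null pkgs) && noDups (map pkgVersion pkgs)
    noDups vs = length vs == length (nub vs)

main :: IO ()
main = do
  let step1 = insertIfAbsent (PkgInfo "foo" [1, 0]) emptyIndex
      step2 = step1 >>= insertIfAbsent (PkgInfo "foo" [1, 1])
      dup   = step2 >>= insertIfAbsent (PkgInfo "foo" [1, 0])
  print (fmap invariant step2)
  print dup
```

Because a list is only ever created by inserting a package into it, the "no empty lists" invariant holds by construction in this sketch.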

PkgInfo

PkgInfo comes from Distribution.Server.Packages.Types. It is the primary type of a package. It takes up some space, I think, which could be reduced by an option to parse cabal files on the fly.

-- | The information we keep about a particular version of a package.
-- 
-- Previous versions of this package name and version may exist as well.
-- We normally disallow re-uploading but may make occasional exceptions.
data PkgInfo = PkgInfo {
    pkgInfoId :: !PackageIdentifier,
    -- | The information held in a parsed .cabal file (used by cabal-install)
    pkgDesc   :: !GenericPackageDescription,
    -- | The .cabal file text.
    pkgData   :: !ByteString,
    -- | The actual package .tar.gz file. It is optional for making an incomplete
    -- mirror, e.g. using archives of just the latest packages, or perhaps for a
    -- multipart upload process.
    -- The canonical tarball URL points to the most recently uploaded package.
    pkgTarball :: ![(BlobId, UploadInfo)],
    -- | Previous data. The UploadInfo does *not* indicate when the ByteString was
    -- uploaded, but rather when it was replaced. This way, pkgUploadData won't change
    -- even if a cabal file is changed.
    -- Should be updated whenever a tarball is uploaded (see mergePkg state function)
    pkgDataOld :: ![(ByteString, UploadInfo)],
    -- | When the package was created. Imports will override this with time in their logs.
    pkgUploadData :: !UploadInfo
}

type UploadInfo = (UTCTime, UserId)
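
The relationship between pkgData, pkgDataOld and pkgUploadData described in the comments can be sketched as below. This is an assumption-laden illustration, not the mergePkg state function: the types are simplified (Int for time and user, String for the cabal text), replaceCabalFile is a hypothetical name, and the newest-first ordering of pkgDataOld is a guess.

```haskell
-- Simplified stand-ins (assumptions for this sketch).
type UserId     = Int
type Time       = Int                -- stands in for UTCTime
type UploadInfo = (Time, UserId)

data PkgInfo = PkgInfo
  { pkgData       :: String                  -- stands in for ByteString
  , pkgDataOld    :: [(String, UploadInfo)]
  , pkgUploadData :: UploadInfo
  } deriving (Show, Eq)

-- | Replace the .cabal text. The old text is pushed onto pkgDataOld,
-- tagged with when and by whom it was replaced (not when it was
-- originally uploaded). pkgUploadData is left untouched.
replaceCabalFile :: String -> UploadInfo -> PkgInfo -> PkgInfo
replaceCabalFile newText replacedAt pkg = pkg
  { pkgData    = newText
  , pkgDataOld = (pkgData pkg, replacedAt) : pkgDataOld pkg
  }

main :: IO ()
main = do
  let original = PkgInfo "name: foo" [] (10, 3)
      revised  = replaceCabalFile "name: foo\nversion: 1" (20, 7) original
  print (pkgUploadData revised)
  print (pkgDataOld revised)
```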

Users

This is the central user database in Hackage, defined in the Distribution.Server.Users.* modules, particularly Types, Users, and State. Every user has a unique UserId (a wrapper around an Int).

data Users = Users {
    -- | A map from UserId to UserInfo
    userIdMap   :: !(IntMap UserInfo),
    -- | A map from active UserNames to the UserId for that name
    userNameMap :: !(Map UserName UserId),
    -- | A map from a UserName to all UserIds which ever used that name
    totalNameMap :: !(Map UserName IntSet),
    -- | The next available UserId
    nextId :: !UserId
  }

newtype UserId = UserId Int
newtype UserName = UserName String

data UserInfo = UserInfo {
    userName   :: UserName,
    userStatus :: UserStatus
  }
data UserStatus = Deleted
                | Historical
                | Active !AccountEnabled UserAuth
data AccountEnabled = Enabled | Disabled
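
To show how the three maps might be kept in step, here is a minimal sketch of adding and deleting users. It is not the code in Distribution.Server.Users.Users: UserStatus is simplified (Active carries no AccountEnabled or UserAuth), addUser and deleteUser are hypothetical, and the rule that deleting a user frees the name for reuse is an assumption.

```haskell
import qualified Data.IntMap as IntMap
import qualified Data.IntSet as IntSet
import qualified Data.Map as Map

newtype UserId   = UserId Int      deriving (Show, Eq, Ord)
newtype UserName = UserName String deriving (Show, Eq, Ord)

-- Simplified: the real Active constructor carries more fields.
data UserStatus = Deleted | Historical | Active deriving (Show, Eq)

data UserInfo = UserInfo { userName :: UserName, userStatus :: UserStatus }
  deriving (Show, Eq)

data Users = Users
  { userIdMap    :: IntMap.IntMap UserInfo
  , userNameMap  :: Map.Map UserName UserId
  , totalNameMap :: Map.Map UserName IntSet.IntSet
  , nextId       :: UserId
  } deriving Show

emptyUsers :: Users
emptyUsers = Users IntMap.empty Map.empty Map.empty (UserId 0)

-- | Add an active user; fails if the name is currently in use.
addUser :: UserName -> Users -> Maybe (Users, UserId)
addUser name users
  | name `Map.member` userNameMap users = Nothing
  | otherwise = Just (users', uid)
  where
    uid@(UserId n) = nextId users
    users' = users
      { userIdMap    = IntMap.insert n (UserInfo name Active) (userIdMap users)
      , userNameMap  = Map.insert name uid (userNameMap users)
      , totalNameMap = Map.insertWith IntSet.union name (IntSet.singleton n)
                                      (totalNameMap users)
      , nextId       = UserId (n + 1)
      }

-- | Mark a user as deleted and free the name (assumed behaviour).
-- The totalNameMap entry is kept as history.
deleteUser :: UserId -> Users -> Users
deleteUser (UserId n) users = case IntMap.lookup n (userIdMap users) of
  Nothing   -> users
  Just info -> users
    { userIdMap   = IntMap.insert n (info { userStatus = Deleted }) (userIdMap users)
    , userNameMap = Map.delete (userName info) (userNameMap users)
    }

main :: IO ()
main = case addUser alice emptyUsers of
  Nothing -> putStrLn "name already taken"
  Just (u1, firstId) -> case addUser alice (deleteUser firstId u1) of
    Nothing -> putStrLn "name already taken"
    Just (u2, _) -> do
      print (Map.lookup alice (userNameMap u2))
      print (fmap IntSet.toList (Map.lookup alice (totalNameMap u2)))
  where
    alice = UserName "alice"
```

After the name is reused, userNameMap points only at the new id while totalNameMap remembers both.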

UserAuth

-- Distribution.Server.Users.Types
data UserAuth = UserAuth PasswdHash AuthType

-- Distribution.Server.Auth.Types
newtype PasswdPlain = PasswdPlain String
newtype PasswdHash  = PasswdHash  String
data AuthType = BasicAuth | DigestAuth

Hackage offers a choice of authentication type. A password stored as a basic (crypt) hash can only be used for basic authentication, or for any other scheme that requires the browser to send the username and password in plain text.

A DigestAuth hash allows the more secure HTTP digest authentication, but it cannot be derived from crypt-hashed information. It can also be used for basic authentication, and this flexibility makes it preferred over the BasicAuth format.
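
The compatibility rules in the two paragraphs above can be summarized in a small sketch. Scheme and supports are hypothetical names introduced for this illustration; only AuthType and its constructors come from the types shown earlier.

```haskell
-- | How the password hash is stored (as in Distribution.Server.Auth.Types).
data AuthType = BasicAuth | DigestAuth
  deriving (Show, Eq)

-- | Hypothetical: the HTTP authentication scheme a client is using.
data Scheme = HttpBasic | HttpDigest
  deriving (Show, Eq)

-- | Can a stored hash of the given type verify a request made with
-- the given scheme?
supports :: AuthType -> Scheme -> Bool
supports BasicAuth  HttpBasic  = True
supports BasicAuth  HttpDigest = False  -- crypt hash cannot yield a digest
supports DigestAuth _          = True   -- works for both schemes

main :: IO ()
main = mapM_ print
  [ (stored, scheme, supports stored scheme)
  | stored <- [BasicAuth, DigestAuth]
  , scheme <- [HttpBasic, HttpDigest]
  ]
```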