Sql Schema

Developer
Aug 4, 2008 at 10:34 PM
Hi

Are you including a SQL schema for your data?
Why roll your own object model rather than use the classes mentioned here: http://msdn.microsoft.com/en-us/library/bb412166.aspx
Coordinator
Aug 9, 2008 at 1:17 PM
I have yet to start work on the SQL schema and the SqlAtomPubProvider.  I'm delaying until the code base matures.  However, I hope to have something by the end of this month, as I'm putting something together for a presentation I'm giving on DLinq.

When I decided to roll my own object model, I weighed many considerations (which I plan to detail in a future blog post).  During the earlier design phases you can see that I started with the built-in Syndication classes (see changeset 21027). I also started using the other new AtomPub classes in the .NET 3.5 SP1 beta.  I just wasn't happy with their model, and the hard decision was made to abandon it (changeset 21161). However, since they are mostly compatible, I'm leaving the option to switch back to the built-in model on the table.
Developer
Sep 10, 2008 at 12:18 PM
The start of implementation and testing against the IAtomPubRepository has raised the first issue with the schema.
AppService is "the container for introspection information associated with one or more workspaces" and in this implementation there can only be one AppService instance. Will Atom ever allow more than one?
So this means that the table AtomAppService does not need a sequential primary key.
This table could be changed to a set of SettingName SettingValue pairs, but I think that it's fine as one row.
Default data in this one row should probably be supplied on schema creation, since the code probably assumes that the AppSettings object returned by the IAtomPubRepository  is not null.

The AtomAppServiceWorkspace table, which makes a many-to-many link to AtomWorkspace, is not needed at all. Even if there were more than one AppService, this would then be an AppServiceId on the AppWorkspace.

Do you agree?
Coordinator
Sep 11, 2008 at 4:27 AM
It is true that this implementation only supports one service document so that each instance of BlogSvc requires one service document. However, that service doc may be filtered depending on who you are and what you are authorized to see.  For example, admins can see the whole document but an author that is only on a single collection could only see that collection within the service doc.  This functionality still needs to be built but I hope it can become part of the next release.

Yes, a service doc must exist so I think we need to come up with a way to create a default service doc. 

I do agree with not needing a many to many link.  Also, I'm starting to think (at least for v1) that we don't need to store the service doc information in the database anyway.  The benefits:
  • all data in database (no service.config)
  • would support scaling to thousands of workspaces and collections
aren't really a high priority. Just storing entries in the database would be a great start.

Thoughts?


Developer
Sep 11, 2008 at 10:12 PM
I'm thinking that since the AtomAppService table exists, it may be worth keeping it. What would a default record contain?
I will make it a one-to-many with AtomWorkspace.

Filtering on authorization is an interesting direction to go for a later release.
Developer
Sep 14, 2008 at 5:10 PM
Another suggestion: have you considered breaking up the interface IAtomPubRepository into three parts?
- IAtomPubMediaRepository for the *Media methods,
- IAtomPubEntryRepository for the *Entry methods
- the rest on an interface that could be called something like IAtomPubmetaDataRepository. You could break this down into Service and Categories if you wanted to.

This would separate responsibilities as those parts don't seem to have much overlap in the code. It would make implementing pieces, and mixing and matching easier.

Anthony

Coordinator
Sep 14, 2008 at 7:07 PM
Edited Sep 15, 2008 at 2:37 AM
Yes, I started off with something like:
  • IAppServiceRepository
  • IAtomEntryRepository
  • IMediaRepository
but there were dependencies from one to the other so I combined them to simplify things.  However, by updating to IoC we could refactor them into separate parts again.  It makes complete sense to me that users would likely want to put AppService document and media on the filesystem but keep all entries in a database.
Developer
Sep 15, 2008 at 9:01 AM
Storing some things in the Db and some on file system makes sense.
Are we in a position to do this refactoring now?
I have made some progress with SQL and mock repositories, but to go further now I'll have to read up on ATOM, and get some more unit tests going.
Coordinator
Sep 15, 2008 at 8:24 PM
I think this is a good time to refactor, and I don't think this change will be that massive.

Let's go with:

  • IAppServiceRepository
  • IAtomEntryRepository
  • IMediaRepository
Also, I want to move all that etag stuff into a separate method called GetEtag to simplify the API.

Finally, I'm pretty sure there will be the following dependencies

IAppServiceRepository -> none
IAtomEntryRepository -> IAppServiceRepository
IMediaRepository -> IAppServiceRepository and IAtomEntryRepository

Note, I'm ignoring external category support, we can add this back in later.
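The dependency chain above can be expressed through constructor arguments. The following is only a rough sketch in Python rather than the project's C#; the method names, the dictionary-shaped service document, and the in-memory classes are all assumptions made for illustration, not the actual BlogSvc API.

```python
from abc import ABC, abstractmethod


class AppServiceRepository(ABC):
    """Service document storage. Depends on no other repository."""

    @abstractmethod
    def get_service(self) -> dict:
        ...


class AtomEntryRepository(ABC):
    """Entry storage. Needs the service document, e.g. to check collections."""

    def __init__(self, services: AppServiceRepository):
        self.services = services

    @abstractmethod
    def get_entry(self, collection: str, entry_id: str) -> dict:
        ...


class MediaRepository(ABC):
    """Media storage. Needs both of the other repositories."""

    def __init__(self, services: AppServiceRepository, entries: AtomEntryRepository):
        self.services = services
        self.entries = entries

    @abstractmethod
    def get_media(self, collection: str, entry_id: str) -> bytes:
        ...


# Hypothetical in-memory implementations, only to show the wiring.
class MemoryServices(AppServiceRepository):
    def get_service(self) -> dict:
        return {"collections": ["blog"]}


class MemoryEntries(AtomEntryRepository):
    def get_entry(self, collection: str, entry_id: str) -> dict:
        # Uses the injected service repository to validate the collection.
        if collection not in self.services.get_service()["collections"]:
            raise KeyError(collection)
        return {"id": entry_id, "collection": collection}


class MemoryMedia(MediaRepository):
    def get_media(self, collection: str, entry_id: str) -> bytes:
        # Media is looked up through its entry, hence the second dependency.
        entry = self.entries.get_entry(collection, entry_id)
        return ("media for " + entry["id"]).encode()


services = MemoryServices()
entries = MemoryEntries(services)
media = MemoryMedia(services, entries)
```

Because each repository receives its dependencies from outside, a file-based service repository could in principle be paired with a database-backed entry repository without either knowing about the other's storage.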
Developer
Sep 15, 2008 at 8:45 PM
Ok. I am happier working with the 3 interfaces rather than one on the SQL and mock code.

In the meantime I had an idea - it doesn't have to be one or the other. I have made AtomPubRepositoryFacade that adapts the 3 interfaces to IAtomPubRepository. This class can be a temporary measure if the three interfaces are the destination.
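A facade of this kind mostly forwards calls. Here is a minimal sketch in Python (not the C# that was checked in); the class name RepositoryFacade, the three method names, and the stub classes are hypothetical and stand in for the real interfaces.

```python
class RepositoryFacade:
    """Presents three separate repositories as one combined object."""

    def __init__(self, services, entries, media):
        self._services = services
        self._entries = entries
        self._media = media

    # Each method of the combined interface forwards to the part that owns it.
    def get_service(self):
        return self._services.get_service()

    def get_entry(self, entry_id):
        return self._entries.get_entry(entry_id)

    def get_media(self, entry_id):
        return self._media.get_media(entry_id)


# Hypothetical stand-ins for the three real repositories.
class StubServices:
    def get_service(self):
        return "service document"


class StubEntries:
    def get_entry(self, entry_id):
        return "entry " + entry_id


class StubMedia:
    def get_media(self, entry_id):
        return b"bytes for " + entry_id.encode()


repository = RepositoryFacade(StubServices(), StubEntries(), StubMedia())
```

Callers written against the combined interface keep working, while new code can depend on just the one part it needs.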
Developer
Sep 15, 2008 at 9:24 PM
Edited Sep 15, 2008 at 9:44 PM
Ok - better understand some of the design decisions thanks to this thread. I was impressed with your AtomPub object model and XmlBase class, which I guess gives the project more flexibility.

I also think that one service document per application domain sounds right... but there could in theory be multiple workspaces, right? Again not close enough to understand for sure yet - but if you were to assign workspaces to organisations - you could end up with a multi-organisation blog/cms app - which is also pretty interesting (and the point at which you would probably move the service document into the database).

Are you thinking about StructureMap as the IoC/DI container?
Developer
Sep 15, 2008 at 9:39 PM
I have checked in stubs for tests on the three LINQ to SQL repository objects, and for the mock repositories.
Ideally the same tests would be run against Sql, Mock and file repositories. Right now the tests are cut and pasted, any thoughts on if there's a better way to do it?
Coordinator
Sep 16, 2008 at 4:43 PM
Edited Sep 16, 2008 at 8:00 PM
I did a blog post about multiple workspaces and collections:
http://blogsvc.net/blog/2008/09/15/MultipleBlogsViaCollectionsAndWorkspaces.xhtml


(and the point at which you would probably move the service document into the database).

For convenience, sites with fewer than 10 or 20 workspaces could use a service.config file for managing the service.  A site like http://geekswithblogs.net would need to use a database.  Actually, during the early design phase I always kept a site like geekswithblogs.net in consideration.  This is why the repository accepts a workspace and collection (rather than just collection id) to many of the entry methods.  By supplying a null workspace and a null collection to these methods, you could get entries for all workspaces or all collections in a single workspace.

Right now the tests are cut and pasted, any thoughts on if there's a better way to do it?
How about if we first set the repository (maybe on a static property), then just call into a common test method?
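One way to picture the common test method, sketched in Python rather than the project's C# test framework. Instead of a static property, this variation passes the repository in as a parameter; both repository classes and their create_entry/get_entry methods are made up for the example.

```python
class MockEntryRepository:
    """Hypothetical repository backed by a dictionary."""

    def __init__(self):
        self._entries = {}

    def create_entry(self, entry_id, title):
        self._entries[entry_id] = title

    def get_entry(self, entry_id):
        return self._entries[entry_id]


class ListEntryRepository:
    """A second hypothetical implementation with different internals."""

    def __init__(self):
        self._rows = []

    def create_entry(self, entry_id, title):
        self._rows.append((entry_id, title))

    def get_entry(self, entry_id):
        for row_id, title in self._rows:
            if row_id == entry_id:
                return title
        raise KeyError(entry_id)


def check_saved_entry_can_be_loaded(repository):
    """Common test body: it knows only the repository interface."""
    repository.create_entry("1", "Hello")
    assert repository.get_entry("1") == "Hello"


def run_against_all(factories):
    """Run the shared check once per repository type; return the names that passed."""
    passed = []
    for factory in factories:
        check_saved_entry_can_be_loaded(factory())
        passed.append(factory.__name__)
    return passed


results = run_against_all([MockEntryRepository, ListEntryRepository])
```

The test body is written once, and adding a SQL or file repository to the run means adding one more factory to the list.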
Developer
Sep 16, 2008 at 6:38 PM
(multiple workspaces and collections) - just looking at your post... this looks great Jarrett. What's more with both Xml and Db providers all options are covered. I've looked at Subtext several times (the blogging engine for geekswithblogs) - have always liked the possibility of running multiple blogs under a single app instance - but they're missing an Xml provider. I currently have a technical blog, personal blog and photogallery - that's three instances... being able to combine them into a single app but with individual themes under a lightweight storage provider has been a goal of mine for a while now. The only options to date have been Subtext, or multiple instances of a blog like dasBlog, or BlogEngine.Net.  And hence why I think BlogSvc.Net is a project that's got legs...
Coordinator
Sep 16, 2008 at 7:48 PM
Note, the current design relies on the GetService repository method being extremely efficient, because the data in the service document is used by the authorization service and almost every method in the atom pub service.  This is why it is cached by the file repository and why it should be cached by the sql repository with a database dependency.

One of the things I looked at was changing the GetService method to require a workspace parameter so the repository could narrow the amount of data returned. Granted, the current design with fixed or external categories could support hundreds of blogs without using too much memory.
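The caching requirement can be sketched as a thin wrapper around a slower repository. This is a Python illustration under assumptions, not the file repository's actual code: the class names are hypothetical, and the explicit invalidate call stands in for whatever change notification (such as a database dependency) the real store would provide.

```python
class CachingServiceRepository:
    """Wraps a slower repository and remembers the service document."""

    def __init__(self, inner):
        self._inner = inner
        self._cached = None

    def get_service(self):
        if self._cached is None:  # first call, or the cache was invalidated
            self._cached = self._inner.get_service()
        return self._cached

    def invalidate(self):
        """Call when the stored document changes."""
        self._cached = None


class CountingRepository:
    """Hypothetical slow store that counts how often it is actually read."""

    def __init__(self):
        self.reads = 0

    def get_service(self):
        self.reads += 1
        return {"workspaces": ["blog"]}


slow = CountingRepository()
cached = CachingServiceRepository(slow)
cached.get_service()
cached.get_service()                  # served from the cache
reads_before_invalidate = slow.reads
cached.invalidate()
cached.get_service()                  # reads the store again
reads_after_invalidate = slow.reads
```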


Coordinator
Sep 19, 2008 at 5:26 AM
Finished a refactor session to separate the responsibilities of the file repository.  I also separated external categories from app service repo.  Now we can use any combination of Sql or File storage for the different types of data.  I haven't unit tested all the refactored code yet, so be warned. Oh yeah, I also temporarily removed the WCF service because I'm tired and didn't feel like getting it to compile.
Developer
Sep 21, 2008 at 3:41 PM
I have reached the stage of testing that admins and workspaces saved on the appService can be loaded again. Of course, this does not yet work on the SQL repository.

Some classes - e.g. AppService, AtomPerson, AppWorkspace - do not have an Id field. Which poses the question: which field(s) can be used as a unique identifier to retrieve or update them on the database?

One way to do this would be, since all these objects have int ids on the database, to put a "public int DatabaseId" on XmlBase, and populate it whenever loading from the database. Otherwise, can we identify key fields in each case?

Considerations would be: is it an issue if this id comes through in the atom xml? And will this work if the database is used for some, but not all, of the repositories?

Coordinator
Sep 22, 2008 at 4:44 PM
IMO, all three make a person unique.  But adding an id extension element (or attribute) to the xml is a perfect example of what makes the built in extensibility of Atom so great.  I'm using extensions on the Service Doc for the FileRepository for paths and StoreDepth settings.  I used a separate namespace as only the FileRepository needs to know about these values.  Also, AtomPub states that these values should be preserved by the clients.

example:

<entry xmlns="http://www.w3.org/2005/Atom" xmlns:sql="http://blogsvc.net/2008/Sql">
  <title>Comment</title>
  <author sql:id="1">
    <name>Jarrett</name>
    <email>jarrettv@gmail.com</email>
    <uri>http://jvance.com</uri>
  </author>
  <content type="html">...</content>
  <updated>2008-08-28T22:21:32.283462-06:00</updated>
  <id>tag:blogsvc.net,2008:info,Features,Comment</id>
</entry>

Then you can either access it directly from the Xml via the helper methods in the base class, or you could add a strongly typed property on an object SqlAtomEntry that inherits AtomEntry.
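To show how such an extension attribute can be read back, here is a small Python sketch using the standard library's ElementTree (the project itself uses Linq to Xml in C#). The helper name get_database_id is hypothetical, and the XML is a trimmed copy of the example above.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
SQL = "http://blogsvc.net/2008/Sql"

entry_xml = """<entry xmlns="http://www.w3.org/2005/Atom" xmlns:sql="http://blogsvc.net/2008/Sql">
  <title>Comment</title>
  <author sql:id="1">
    <name>Jarrett</name>
  </author>
</entry>"""


def get_database_id(element):
    """Return the sql:id extension attribute as an int, or None if absent."""
    # ElementTree stores namespaced attributes under the key "{namespace}name".
    value = element.get("{%s}id" % SQL)
    return int(value) if value is not None else None


entry = ET.fromstring(entry_xml)
author = entry.find("{%s}author" % ATOM)
title = entry.find("{%s}title" % ATOM)

author_id = get_database_id(author)   # the attribute is present on <author>
title_id = get_database_id(title)     # <title> carries no sql:id
```

An element without the attribute simply reports None, which matches the idea of telling whether an object has been stored in the database yet.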
Developer
Sep 22, 2008 at 8:34 PM
Edited Sep 22, 2008 at 10:07 PM
Excellent. I have started off with a SqlAppService.
I have also decided that the mock repository will pretend to be a Sql repository.
Feel free to rearrange things if need be.
I'm not finished yet at all, but I think this is the way forward. Now I can tell if an AtomPerson is on the database yet or not.
More to come later this week.
Developer
Sep 24, 2008 at 9:34 PM
Edited Sep 25, 2008 at 9:10 AM
I am using a strongly typed property on a subclass. Casting back and forth is proving to be an annoyance. I might try an extension method instead to provide this property when objects are in the repository.

Update: it may not be possible to use an extension method. I have asked the question here:

http://stackoverflow.com/questions/132245/is-there-any-way-to-use-an-extension-method-in-an-object-initializer-block-in-c
Developer
Sep 24, 2008 at 10:45 PM
Still getting up to speed with the codebase. Trying to absorb a lot in 24 hours.... including all of 21 episodes of Rob Conery's StoreFront series. If I hadn't read LINQ in Action a while back I'd be totally lost  [as opposed to only partially lost] :-)

Anthony - in the SqlRepository Provider - you've created an EntryFilters class... and if I've understood how this works.. it gives the provider a set of extension methods with which it can provide additional filters against AtomEntries in the AtomDataClassesDataContext.

I was wondering if the EntryFilters filter class (or in fact a set of filter classes) couldn't be promoted to a first class citizen of the repository API?  Again if I've understood this correctly, to do this the IAtomEntryRepository would have to be updated to return  IQueryable<AtomEntry> instead of  IEnumerable<AtomEntry> and the various GetEntries methods of IAtomEntryRepository would be reduced to a single GetEntries method. Filtering would then be implemented by the extension methods of the filter class, which could be shared by all of the providers.

Ignore me if I'm miles off...

Again - I'm not 100% sure here but another advantage of using IQueryable vs IEnumerable - is that the entire custom Atom object model could possibly be exposed by an ASP.Net Data Service? Again not sure here... so feel free to ignore me...
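As a rough picture of the single-GetEntries-plus-filters idea, here is a Python sketch using generators. It only illustrates composition and deferred evaluation in memory; unlike IQueryable with LINQ to SQL, nothing here is translated into a database query. The Entry fields and the filter names are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    title: str
    collection: str
    draft: bool


def get_entries(store):
    """The single repository method: yields every entry, lazily."""
    for entry in store:
        yield entry


# Filters are shared by all providers: each takes a stream of entries
# and returns a narrower stream, so they can be chained in any order.
def in_collection(entries, name):
    return (e for e in entries if e.collection == name)


def published(entries):
    return (e for e in entries if not e.draft)


store = [
    Entry("A", "blog", False),
    Entry("B", "blog", True),
    Entry("C", "photos", False),
]

# Building the query does no work yet; entries are only read
# when the result is enumerated on the last line.
query = published(in_collection(get_entries(store), "blog"))
titles = [e.title for e in query]
```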




Developer
Sep 25, 2008 at 8:17 AM
Edited Sep 25, 2008 at 8:18 AM
I watched a lot of Rob Conery's excellent series, and I am familiar with the idea. The questions here are
1) Does this work well for the file repository - which is the only one that actually works at present
2) Will "lazy loading" still work given that filtering would have to happen after converting the Repository types into Domain types? If filtering after constructing the domain objects results in inefficient SQL, then it's a no go.

I'm not a LINQ guru so I don't have all the answers.

Other than that, it would simplify the entry repository interface and make the filtering code common between any existing or potential repositories, which would be a benefit.


Developer
Sep 25, 2008 at 8:45 AM
Hi Anthony - actually it was the post on your blog about Rob Conery's series that sent me over there :-)

1) I think it will work... the query expressions currently used in the FileAtomEntryRepository just need to end in .AsQueryable() in order to change the return type (allowing the use of the IQueryable filter extension methods)
2) I guess for lazy loading to work - the project would also need to implement the LazyList class shown in the Storefront project - so that the query expression is only evaluated when the items are enumerated...

But this is a totally new approach. It was cool to see it in action on the Storefront project, but it's new territory for sure...
Coordinator
Sep 26, 2008 at 7:37 PM
Edited Sep 26, 2008 at 7:41 PM
The "Filters" method seen in the StoreFront could work for both File and Sql repositories.  However, it would place restrictions on which storage technologies you could use.  Besides Linq to Objects (which the File repository utilizes) and Linq to Sql, you're likely going to find it doesn't work.  For example, see this blog post about this exact problem in Linq to Entities.

The "Filters" method is very very tempting though. I love how elegant it is.

Side note: the xml based objects heavily use Linq to Xml as a translation layer between POCO and Xml

Another side note: the File Repository will be much slower (than Sql) for blogs with thousands of entries, as it must do all the filtering in memory.  This situation could be improved for Dated collections and/or with the addition of some sort of indexing.
Developer
Sep 27, 2008 at 12:53 AM
Edited Sep 27, 2008 at 2:17 AM
Ouch... the Linq to Entities (EF) issue sort of blows a hole in my (limited) understanding of Linq then... Great that you discovered this now. Probably best to stick with the full interface definitions for now so that other providers can be supported... even good old Entlib Data Access Block, or even better... an NHibernate provider.