Welcome to dbForumz.com!
FAQFAQ    SearchSearch      ProfileProfile    Private MessagesPrivate Messages   Log inLog in

Nearest Common Ancestor Report (XDb1's $1000 Challenge)

 
Goto page 1, 2, 3
   Database Forums (Home) -> Object-Oriented RSS
Next:  Transitive Closure with XDb1  
Author Message
Neo

External


Since: Nov 30, 2003
Posts: 127



(Msg. 1) Posted: Sun May 16, 2004 2:27 pm
Post subject: Nearest Common Ancestor Report (XDb1's $1000 Challenge)
Archived from groups: comp>databases>object, others (more info?)

 > Tony wrote: If your analogy holds any water at all (to give you the
 > benefit of very large doubt), it suggests that relational theory will do
 > just fine for pretty much anything we ever want to do "in the real world".

$1000 to the first person who replicates the equivalent of
<a rel="nofollow" style='text-decoration: none;' href="http://www.xdb1.com/Example/Ex076.asp" target="_blank">www.xdb1.com/Example/Ex076.asp</a> using the relational model. To claim
the prize, one needs to produce the equivalent Nearest Common Ancestor
Report from normalized and NULL-less data and the solution must be as
generic, meaning allow the user to create any hierarchy, consisting of
different types of things (each type to allow different attributes)
and each thing in the hierarchy to have any number of parents. Report
generation must not be more than 2X slower than XDb1 on equivalent
hardware.

 >> Stay informed about: Nearest Common Ancestor Report (XDb1's $1000 Challenge) 
Back to top
Login to vote
Topmind

External


Since: May 16, 2004
Posts: 2



(Msg. 2) Posted: Sun May 16, 2004 11:36 pm
Post subject: Re: Nearest Common Ancestor Report (XDb1's $1000 Challenge) [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

  > > Tony wrote: If your analogy holds any water at all (to give you the
  > > benefit of very large doubt), it suggests that relational theory will do
  > > just fine for pretty much anything we ever want to do "in the real world".

Relational may not be Turing Complete, but it does not have to be.
Nor is it meant to be a "total solution". It is a very helpful
tool that complements application code (and may even help
organize it).

Relational may not do everything well, just an awful lot well.
I do agree that it would be nice
if relational implementations had more
hierarchical operators, but in practice most classification
systems are not really trees when they grow beyond the
non-trivial. Trees have some nasty limitations, yet some
people keep seeing them as a the Ultimate Structure.
The real world is not tree-shaped for the most part.
Philosophers have known this for a hundred+ years.
Tree-like elements pop up, but there are usually enough
exceptions to make a pure tree impracticle. It degenerates
into a graph, and relational many-to-many tables are pretty
good at that.

 >
 > $1000 to the first person who replicates the equivalent of
 > <a rel="nofollow" style='text-decoration: none;' href="http://www.xdb1.com/Example/Ex076.asp" target="_blank">www.xdb1.com/Example/Ex076.asp</a> using the relational model. To claim
 > the prize, one needs to produce the equivalent Nearest Common Ancestor
 > Report from normalized and NULL-less data and the solution must be as
 > generic, meaning allow the user to create any hierarchy, consisting of
 > different types of things (each type to allow different attributes)
 > and each thing in the hierarchy to have any number of parents. Report
 > generation must not be more than 2X slower than XDb1 on equivalent
 > hardware.

-T-

 >> Stay informed about: Nearest Common Ancestor Report (XDb1's $1000 Challenge) 
Back to top
Login to vote
Topmind

External


Since: May 16, 2004
Posts: 2



(Msg. 3) Posted: Sun May 16, 2004 11:40 pm
Post subject: Re: Nearest Common Ancestor Report (XDb1's $1000 Challenge) [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

  > > Tony wrote: If your analogy holds any water at all (to give you the
  > > benefit of very large doubt), it suggests that relational theory will do
  > > just fine for pretty much anything we ever want to do "in the real world".

Relational may not be Turing Complete, but it does not have to be.
Nor is it meant to be a "total solution". It is a very helpful
tool that complements application code (and may even help
organize it).

Relational may not do everything well, just an awful lot well.
I do agree that it would be nice
if relational implementations had more
hierarchical operators, but in practice most classification
systems are not really trees when they grow beyond the
non-trivial. Trees have some nasty limitations, yet some
people keep seeing them as a the Ultimate Structure.
The real world is not tree-shaped for the most part.
Philosophers have known this for a hundred+ years.
Tree-like elements pop up, but there are usually enough
exceptions to make a pure tree impracticle. It degenerates
into a graph, and relational many-to-many tables are pretty
good at that.

 >
 > $1000 to the first person who replicates the equivalent of
 > <a rel="nofollow" style='text-decoration: none;' href="http://www.xdb1.com/Example/Ex076.asp" target="_blank">www.xdb1.com/Example/Ex076.asp</a> using the relational model. To claim
 > the prize, one needs to produce the equivalent Nearest Common Ancestor
 > Report from normalized and NULL-less data and the solution must be as
 > generic, meaning allow the user to create any hierarchy, consisting of
 > different types of things (each type to allow different attributes)
 > and each thing in the hierarchy to have any number of parents. Report
 > generation must not be more than 2X slower than XDb1 on equivalent
 > hardware.

-T-
 >> Stay informed about: Nearest Common Ancestor Report (XDb1's $1000 Challenge) 
Back to top
Login to vote
Hugo Kornelis

External


Since: May 14, 2004
Posts: 700



(Msg. 4) Posted: Mon May 17, 2004 4:08 am
Post subject: Re: Nearest Common Ancestor Report (XDb1's $1000 Challenge) [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

On 16 May 2004 11:27:07 -0700, Neo wrote:

  >> Tony wrote: If yovr analogy holds any water at all (to give yov the
  >> benefit of very large dovbt), it svggests that relational theory will do
  >> jvst fine for pretty mvch anything we ever want to do "in the real world".
 >
 >$1000 to the first person who replicates the eqvivalent of
 >www.xdb1.com/Example/Ex076.asp vsing the relational model. To claim
 >the prize, one needs to prodvce the eqvivalent Nearest Common Ancestor
 >Report from normalized and NULL-less data and the solvtion mvst be as
 >generic, meaning allow the vser to create any hierarchy, consisting of
 >different types of things (each type to allow different attribvtes)
 >and each thing in the hierarchy to have any nvmber of parents. Report
 >generation mvst not be more than 2X slower than XDb1 on eqvivalent
 >hardware.

Hi Neo,

I'm not svre if I vnderstand all reqvirements. Yov demand yov have to
start from normalized data, bvt yov fail to specify what normal form yov
want: 1NF, 2NF, 3NF, etc.

Fvrthermore, yov seem to desire the possibility to enter vntyped data,
which is of covrse impossible in a strong-typed langvage. I do present
"sort of" a way to do this in a relational database, bvt I'd never vse a
klvdge even remotely like this for real. Jvst as I consider Xdb1 to be
completely worthless for any real probel, for exactly this same reason.
Remove types, and nothing prevents yovr vser from entering "banana" as
John's age.

Third: yov ask for the qvoted example to be replicated "vsing the
relational model". A model is not something that can be vsed to prodvce
reports from data. I assvme yov meant to write "vsing an RDBMS"?

Many people in the comp.databases.theory grovp claim that MS SQL Server is
not really relational, bvt as I don't have access to a "real" relational
database, I hope it'll do.

One other thing that strvck me in the inpvt data:

 >age isa thing.
 >35 isa age.
 >john is 35.
 >
 >weight isa thing.
 >130 isa weight.
 >mary is 130.

What is John were older (let's say 85) and Mary lost a lot of weight
(let's say 45 povnds). I fail to see how Xdb1 covld possibly be able to
interpret the following correctly:

  age isa thing.
  80 isa age.
  john is 80.

  weight isa thing.
  80 isa weight.
  mary is 80.

Since 80 is both an age and a weight, how can Xdb1 make sense of this?


Below, yov'll find a script that rvns against MS SQL Server to reprodvce
yovr report. There's one difference in the ovtpvt: the "things" fido and
laptop1 have two common ancestors at the same distance (john and mary),
bvt Xdb1 is only able to find john. I consider this to be a bvg in Xdb1's
ovtpvt, since john is definitely NOT a "nearer" common ancestor that mary.


-- svppress "(..) row(s) affected" messages
set nocovnt on
-- define some tables to hold the vser inpvt
create table things (thing varchar(20) not nvll,
primary key (thing))
create table types (thing varchar(20) not nvll references things,
type varchar(20) not nvll,
primary key (thing))
create table attribvtes_int (thing varchar(20) not nvll references things,
attribvte varchar(10) not nvll,
valve int not nvll,
primary key (thing, attribvte))
create table attribvtes_char (thing varchar(20) not nvll references
things,
attribvte varchar(10) not nvll,
valve varchar(100) not nvll,
primary key (thing, attribvte))
create table hierarchies (thing varchar(20) not nvll references things,
otherthing varchar(20) not nvll references
things,
hierarchy varchar(20) not nvll,
primary key (hierarchy, thing, otherthing),
check (thing <> otherthing))
-- this table is vsed to generate the report
create table ancestors (thing varchar(20) not nvll,
ancestor varchar(20) not nvll,
dist int not nvll,
primary key (thing, ancestor))
-- this table will hold the report
create table NCancestor (ThingX varchar(20) not nvll,
ThingY varchar(20) not nvll,
CmnAnc varchar(20) not nvll,
Dist int not nvll,
primary key (ThingX, ThingY, CmnAnc))
go
-- this procedvre creates the report
create proc CommonAncestors
(@hierarchy varchar(20))
as
-- delete data from previovs execvtion
delete ancestors
delete NCancestor
-- starting data: thing itself is dist 0, direct leader is dist 1
insert ancestors (thing, ancestor, dist)
select distinct thing, thing, 0
from things
vnion all
select thing, otherthing, 1
from hierarchies
where hierarchy = @hierarchy
-- vse iteration to determine indirect leaders
while (@@rowcovnt<>0)
insert ancestors (thing, ancestor, dist)
select distinct a1.thing, a2.ancestor, a1.dist + a2.dist
from ancestors AS a1
join ancestors AS a2
on a2.thing = a1.ancestor
where not exists
(select *
from ancestors AS a3
where a3.thing = a1.thing
and a3.ancestor = a2.ancestor)
-- now find nearest common ancestor for each pair of things
insert NCancestor (ThingX, ThingY, CmnAnc, Dist)
select a1.thing, a2.thing, a1.ancestor, a1.dist + a2.dist
from ancestors AS a1
join ancestors AS a2
on a1.ancestor = a2.ancestor
and a1.thing < a2.thing
where not exists
(select *
from ancestors AS a3
join ancestors AS a4
on a3.ancestor = a4.ancestor
and a3.thing < a4.thing
where a3.thing = a1.thing
and a4.thing = a2.thing
and a4.dist+a3.dist < a1.dist+a2.dist)
go
-- dvmmy execvtion on empty table forces compilation of stored proc
-- (I assvme that Xdb1 vses a precompiled procedvre as well)
exec CommonAncestors @hierarchy = 'whatever'
go
-- now add all the data
insert things (thing)
valves ('god')
insert things (thing)
valves ('army')
insert things (thing)
valves ('trinity')
insert things (thing)
valves ('john')
insert things (thing)
valves ('mary')
insert things (thing)
valves ('lvke')
insert things (thing)
valves ('fido')
insert things (thing)
valves ('laptop1')
go
insert types (thing, type)
valves ('army', 'force')
insert types (thing, type)
valves ('trinity', 'chvrch')
insert types (thing, type)
valves ('john', 'person')
insert types (thing, type)
valves ('mary', 'person')
insert types (thing, type)
valves ('lvke', 'person')
insert types (thing, type)
valves ('fido', 'dog')
insert types (thing, type)
valves ('laptop1', 'compvter')
go
insert attribvtes_int (thing, attribvte, valve)
valves ('john', 'age', 35)
insert attribvtes_int (thing, attribvte, valve)
valves ('mary', 'weight', 130)
insert attribvtes_char (thing, attribvte, valve)
valves ('lvke', 'color', 'red')
go
insert hierarchies (thing, otherthing, hierarchy)
valves ('army', 'god', 'leader')
insert hierarchies (thing, otherthing, hierarchy)
valves ('trinity', 'god', 'leader')
insert hierarchies (thing, otherthing, hierarchy)
valves ('john', 'army', 'leader')
insert hierarchies (thing, otherthing, hierarchy)
valves ('mary', 'army', 'leader')
insert hierarchies (thing, otherthing, hierarchy)
valves ('mary', 'trinity', 'leader')
insert hierarchies (thing, otherthing, hierarchy)
valves ('lvke', 'trinity', 'leader')
insert hierarchies (thing, otherthing, hierarchy)
valves ('laptop1', 'john', 'leader')
insert hierarchies (thing, otherthing, hierarchy)
valves ('laptop1', 'mary', 'leader')
insert hierarchies (thing, otherthing, hierarchy)
valves ('fido', 'john', 'leader')
insert hierarchies (thing, otherthing, hierarchy)
valves ('fido', 'mary', 'leader')
insert hierarchies (thing, otherthing, hierarchy)
valves ('fido', 'lvke', 'leader')
go
-- generate the report - show elapsed time
select cvrrent_timestamp AS 'Starting time'
exec CommonAncestors @hierarchy='leader'
select cvrrent_timestamp AS 'Ending time'
-- display the resvlts
select *
from NCancestor
go
-- cleanvp after execvtion
drop proc CommonAncestors
drop table NCancestor
drop table ancestors
drop table hierarchies
drop table attribvtes_int
drop table attribvtes_char
drop table types
drop table things


Ovtpvt:

Starting time
------------------------------------------------------
2004-05-17 00:58:40.047

Ending time
------------------------------------------------------
2004-05-17 00:58:40.047

ThingX ThingY CmnAnc Dist
-------------------- -------------------- -------------------- -----------
army fido army 2
army god god 1
army john army 1
army laptop1 army 2
army lvke god 3
army mary army 1
army trinity god 2
fido god god 3
fido john john 1
fido laptop1 john 2
fido laptop1 mary 2
fido lvke lvke 1
fido mary mary 1
fido trinity trinity 2
god john god 2
god laptop1 god 3
god lvke god 2
god mary god 2
god trinity god 1
john laptop1 john 1
john lvke god 4
john mary army 2
john trinity god 3
laptop1 lvke trinity 3
laptop1 mary mary 1
laptop1 trinity trinity 2
lvke mary trinity 2
lvke trinity trinity 1
mary trinity trinity 1


Note: the ovtpvt shows the same starting and ending date/time. This
indicates an elapsed time of less than 3 milliseconds (the smallest vnit
of time CURRENT_TIMESTAMP can measvre). System vsed: PC rvnning Windows
2000 + MS SQL Server 2000, AMD Athlon 1.3GHz, 256MB RAM.

Of covrse, the performance of any qvery against less than 20 rows of data
is completely irrelevant for any real world need. What covnts is the
performance of a qvery against millions of rows of data.

Groetjes, Hvgo
--

Sorry, vandaag geen grappige sig lines meer.
 >> Stay informed about: Nearest Common Ancestor Report (XDb1's $1000 Challenge) 
Back to top
Login to vote
Neo

External


Since: Nov 30, 2003
Posts: 127



(Msg. 5) Posted: Mon May 17, 2004 11:44 am
Post subject: Re: Nearest Common Ancestor Report (XDb1's $1000 Challenge) [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

 > Relational may not do everything well, just an awful lot well.

I agree with your point that presently the relational model is
probably the best overall solution to many common applications.

 > The real world is not tree-shaped for the most part.

I agree with your point and XDb1 is not based on lists, tables or
trees; however, it can represent them. XDb1 is a limited implemenation
of an alternate data model.
 >> Stay informed about: Nearest Common Ancestor Report (XDb1's $1000 Challenge) 
Back to top
Login to vote
--CELKO--

External


Since: May 17, 2004
Posts: 156



(Msg. 6) Posted: Mon May 17, 2004 1:35 pm
Post subject: Re: Nearest Common Ancestor Report (XDb1's $1000 Challenge) [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

  >> $1000 to the first person who replicates the equivalent of
<a rel="nofollow" style='text-decoration: none;' href="http://www.xdb1.com/Example/Ex076.asp" target="_blank">www.xdb1.com/Example/Ex076.asp</a> using the relational model. To claim
the prize, one needs to produce the equivalent Nearest Common Ancestor
Report from normalized and NULL-less data and the solution must be as
generic, meaning allow the user to create any hierarchy, consisting of
different types of things (each type to allow different attributes)
and each thing in the hierarchy to have any number of parents. Report
generation must not be more than 2X slower than XDb1 on equivalent
hardware. <<

Here is the link on Amazon.com for my new book on "Trees & Hierarchies
in SQL"

<a rel="nofollow" style='text-decoration: none;' href="http://www.amazon.com/exec/obidos/tg/detail/-/1558609202/qid=1080772873/sr=1-1/ref=sr_1_1/102-7683601-6345721?v=glance&s=books#product-details." target="_blank">http://www.amazon.com/exec/obidos/tg/detail/-/1558609202/qid=108077287...r=1-1/r</a>

Another way of representing trees is to show them as nested sets.
Since SQL is a set oriented language, this is a better model than the
usual adjacency list approach you see in most text books. Let us
define a simple OrgChart table like this.

CREATE TABLE OrgChart
(emp CHAR(10) NOT NULL PRIMARY KEY,
lft INTEGER NOT NULL UNIQUE CHECK (lft > 0),
rgt INTEGER NOT NULL UNIQUE CHECK (rgt > 1),
CONSTRAINT order_okay CHECK (lft < rgt) );

I can add some more constraints to handle overlap, etc. in full
SQL-92, but let me skip those for this newsgroup posting.

The organizational chart would look like this as a directed graph:

Albert (1,12)
/ \
/ \
Bert (2,3) Chuck (4,11)
/ | \
/ | \
/ | \
/ | \
Donna (5,6) Eddie (7,Cool Fred (9,10)

OrgChart
emp lft rgt
======================
'Albert' 1 12
'Bert' 2 3
'Chuck' 4 11
'Donna' 5 6
'Eddie' 7 8
'Fred' 9 10

The nodes will be in a separate table, referenced from the tree
structure by a FOREIGN KEY having the DRI actions needed for the
particular business rules.

This model has some predictable results that we can use for building
queries. The root is always (left = 1, right = 2 * (SELECT COUNT(*)
FROM TreeTable)); leaf nodes always have (left + 1 = right); subtrees
are defined by the BETWEEN predicate; etc. Here are two common queries
which can be used to build others:

1. An employee and all their Supervisors, no matter how deep the tree.

SELECT O2.*
FROM OrgChart AS O1, OrgChart AS O2
WHERE O1.lft BETWEEN O2.lft AND O2.rgt
AND O1.emp = :myemployee;

2. The employee and all subordinates. There is a nice symmetry here.

SELECT O1.*
FROM OrgChart AS O1, OrgChart AS O2
WHERE O1.lft BETWEEN O2.lft AND O2.rgt
AND O2.emp = :myemployee;

3. Add a GROUP BY and aggregate functions to these basic queries and
you have hierarchical reports. For example, the total salaries which
each employee controls:

SELECT O2.emp, SUM(S1.salary)
FROM OrgChart AS O1, OrgChart AS O2,
Salaries AS S1
WHERE O1.lft BETWEEN O2.lft AND O2.rgt
AND O1.emp = S1.emp
GROUP BY O2.emp;

4. To find the level of each emp, so you can print the tree as an
indented listing. Technically, you should declare a cursor to go with
the ORDER BY clause.

SELECT COUNT(O2.emp) AS indentation, O1.emp
FROM OrgChart AS O1, OrgChart AS O2
WHERE O1.lft BETWEEN O2.lft AND O2.rgt
GROUP BY O1.lft, O1.emp
ORDER BY O1.lft;

5. To convert a nested sets model into an adjacency list model:

SELECT B.emp AS boss, E.emp
FROM OrgChart AS E
LEFT OUTER JOIN
OrgChart AS B
ON B.lft
= (SELECT MAX(lft)
FROM OrgChart AS S
WHERE E.lft > S.lft
AND E.lft < S.rgt);


6. Given two employees, find all their common supervisors:

SELECT DISTINCT S1.emp
FROM OrgChart AS E1,
OrgChart AS E2,
OrgChart AS S1
WHERE E1.emp = :my_emp_1
AND E2.emp = :my_emp_2
AND E1.lft BETWEEN S1.lft AND S1.rgt
AND E2.lft BETWEEN S1.lft AND S1.rgt;

This is just a version of the usual nesting predicates (1) and (2).

7. Given two employees, find their nearest common supervisor,

WITH (SELECT S1.emp, (S1.rgt - S1.lft)
FROM OrgChart AS E1,
OrgChart AS E2,
OrgChart AS S1
WHERE E1.emp = :my_emp_1
AND E2.emp = :my_emp_2
AND E1.lft BETWEEN S1.lft AND S1.rgt
AND E2.lft BETWEEN S1.lft AND S1.rgt)
AS S0 (emp, gap)
SELECT S0.emp
FROM S0
WHERE S0.gap
= (SELECT MIN(gap) FROM S0);

The WITH clause is part of SQL-99 which you could replace with a VIEW
or derived tables. The gap is the "diameter" of the containing set,
and we are looking for the smallest such set. Here is the solution in
pure SQL-92
SELECT S0.emp
FROM (SELECT S1.emp, (S1.rgt - S1.lft)
FROM OrgChart AS E1,
OrgChart AS E2,
OrgChart AS S1
WHERE E1.emp = @my_emp_1
AND E2.emp = @my_emp_2
AND E1.lft BETWEEN S1.lft AND S1.rgt
AND E2.lft BETWEEN S1.lft AND S1.rgt) AS S0(emp, gap)
WHERE S0.gap
=(SELECT MIN (S1.rgt - S1.lft)
FROM OrgChart AS E1,
OrgChart AS E2,
OrgChart AS S1
WHERE E1.emp = @my_emp_1
AND E2.emp = @my_emp_2
AND E1.lft BETWEEN S1.lft AND S1.rgt
AND E2.lft BETWEEN S1.lft AND S1.rgt);

What was not given in the specs is how to handle two identical
employees; my solution is to say that they are their own nearest
supervisor.
 >> Stay informed about: Nearest Common Ancestor Report (XDb1's $1000 Challenge) 
Back to top
Login to vote
Neo

External


Since: Nov 30, 2003
Posts: 127



(Msg. 7) Posted: Mon May 17, 2004 6:12 pm
Post subject: Re: Nearest Common Ancestor Report (XDb1's $1000 Challenge) [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

 > You ask for the quoted example to be replicated "using the
 > relational model". A model is not something that can be used to produce
 > reports from data. I assume you meant to write "using an RDBMS"?
 > Many people in the comp.databases.theory group claim that MS SQL Server is
 > not really relational, but as I don't have access to a "real" relational
 > database, I hope it'll do.

Yes, I meant a reasonably close implementation of the relational model.
And yes, MS SQL Server (or MS Access) are reasonable (to me).
 >> Stay informed about: Nearest Common Ancestor Report (XDb1's $1000 Challenge) 
Back to top
Login to vote
Dawn M. Wolthuis

External


Since: May 17, 2004
Posts: 217



(Msg. 8) Posted: Mon May 17, 2004 6:20 pm
Post subject: Re: Nearest Common Ancestor Report (XDb1's $1000 Challenge) [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

"Topmind" wrote in message

   > > > Tony wrote: If your analogy holds any water at all (to give you the
   > > > benefit of very large doubt), it suggests that relational theory will
do
   > > > just fine for pretty much anything we ever want to do "in the real
world".
 >
 > Relational may not be Turing Complete, but it does not have to be.
 > Nor is it meant to be a "total solution". It is a very helpful
 > tool that complements application code (and may even help
 > organize it).
 >
 > Relational may not do everything well, just an awful lot well.
 > I do agree that it would be nice
 > if relational implementations had more
 > hierarchical operators, but in practice most classification
 > systems are not really trees when they grow beyond the
 > non-trivial. Trees have some nasty limitations, yet some
 > people keep seeing them as a the Ultimate Structure.
 > The real world is not tree-shaped for the most part.
 > Philosophers have known this for a hundred+ years.
 > Tree-like elements pop up, but there are usually enough
 > exceptions to make a pure tree impracticle. It degenerates
 > into a graph, and relational many-to-many tables are pretty
 > good at that.

Just to get terminology clearer -- a tree is a graph.

I don't have a large background in graph theory, so others can chime in, but
some graphs have cycles and some don't -- some have direction, some are
strict trees. For every graph there are some corresponding trees that can
be useful in navigating the graph. The web is an implementation of a
directed graph, for example, where our road system is an implementation of a
graph, but not a directed one (or you can think of there as being directions
assigned to each street, most of those directions being both ways).
Implementations of graphs are all around us. There is nothing "ultimate"
about them, but they do provide a reasonably straightforward way to model
propositions, for example.

<snip>
--dawn
 >> Stay informed about: Nearest Common Ancestor Report (XDb1's $1000 Challenge) 
Back to top
Login to vote
Neo

External


Since: Nov 30, 2003
Posts: 127



(Msg. 9) Posted: Mon May 17, 2004 8:52 pm
Post subject: Re: Nearest Common Ancestor Report (XDb1's $1000 Challenge) [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

 > Furthermore, you seem to desire the possibility to enter untyped data,
 > which is of course impossible in a strong-typed language. I do present
 > "sort of" a way to do this in a relational database, but I'd never use a
 > kludge even remotely like this for real. Just as I consider XDb1 to be
 > completely worthless for any real problem, for exactly this same reason.
 > Remove types, and nothing prevents your user from entering "banana" as
 > John's age.

All things in XDb1 are typed/classified.
In XDb1, thing is the most general class.
Person's class is thing (person isa thing).
John's class is person (john isa person).
Mary's class is person (mary isa person).
Color's class is thing (color isa thing).
Red's class is color (red isa color).
Dog's class is thing (dog isa thing).
Fido's class is dog (fido isa dog). Etc...
Except for thing (which is the root),
can you name any thing in an XDb1 database that isn't classified?

In RM, classification can be accomplished similarly. By adding a row
in a table named T_Person, one effectively classifies that row as a
person. I think you may be confusing or limiting typing to hardware
types (ie bit, byte and integer, etc). XDb1's data model doesn't
require hardware to have bit, byte or integer and is implemented as
such.

Any thing in XDb1 can have multiple classifications. For example, we
can further classify John as a doctor in addition to being a person
(john isa doctor). Also if user provides 35, XDb1/application can
classify it as both an integer and age. If user provides 35.1,
XDb1/application can classify it as both a decimal and age. If user
provides 35 & 1/3, XDb1/application can classify it as both a fraction
and age. If user provides thirty-five, XDb1/application can classify
it as both a word and age. If user provides "over-the-hill",
XDb1/application can classify it as both an expression and age.

You are correct in that XDb1 does not automatically validate "basic"
classes such as bit, byte and integer. In XDb1, bits, bytes, integer
are currently classifications whose rules need to be implemented by
the user. In the future they (along with other common classifications
such as color and person) might be provided.

Suppose in the future, user creates type X. X has nothing to do with
hardware bits, bytes or integers and is not built into the db. What
will validate that x1 is a X in RM? It will be application logic just
as it is in XDb1 (even for bit, byte and integer since XDb1 doesn't
require them).
 >> Stay informed about: Nearest Common Ancestor Report (XDb1's $1000 Challenge) 
Back to top
Login to vote
Neo

External


Since: Nov 30, 2003
Posts: 127



(Msg. 10) Posted: Mon May 17, 2004 11:52 pm
Post subject: Re: Nearest Common Ancestor Report (XDb1's $1000 Challenge) [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

 > Remove types, and nothing prevents your user from entering "banana" as
 > John's age.

It is true that RM implementations provide checking for basic types
such as integer. Thus if one trys to enter "banana" for age, it
couldn't be done (assuming the column is typed as integer). XDb1 does
not provide the usual "basic" types/classes, except those needed to
boot the db which are thing, relator, symbol and name. Currently the
"basic" types/classes have to be created and checked by the user's
code.

Now consider the type/class color. It is not a built in type. How can
RM implementations prevent the user from entering "banana" (the fruit)
for color?

In most RM implemenations, "basic" types are inextricably related to
hardware. In XDb1, no type/class (except maybe for the thing class) is
implemented in hardware. Instead, they are abstracted in the db.
 >> Stay informed about: Nearest Common Ancestor Report (XDb1's $1000 Challenge) 
Back to top
Login to vote
Neo

External


Since: Nov 30, 2003
Posts: 127



(Msg. 11) Posted: Tue May 18, 2004 12:01 am
Post subject: Re: Nearest Common Ancestor Report (XDb1's $1000 Challenge) [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

 > If ....
 > age isa thing.
 > 80 isa age.
 > john is 80.
 >
 > weight isa thing.
 > 80 isa weight.
 > mary is 80.
 >
 > Since 80 is both an age and a weight, how can XDb1 make sense of this?

In the cases where a name uniquely identifies a thing (ie "35 isa age"
and "130 isa weight"), the following simple format can be used:
john is 35.
mary is 130.

When a name identifies more than one thing (ie "80 isa age" and "80
isa weight") the above statements will cause XDb1 to prompt user with
"80 is ambiguous". User than has two option: a) use the GUI to point
and click the correct 80, b) use the following unambigious format:
john's age is 80.
mary's weight is 80.

Note that 80 is not both age and weight.
80 is the name of A weight
80 is also the name of A age.
The name of a thing and the thing the name identifies are two
different things.

The above statements create the RM equivalent of the following:
(Note: ->XYZ represents appropriate ID)

T_Person
ID Name Age Weight
-- ----- -------- -------------
.. ->John ->Age/80
.. ->Mary ->Weight/80

T_Age
ID Name
.. ->80

T_Weight
ID Name
.. ->80

T_Name (This "table" pre exists in XDb1, but is not pre-populated)
ID Symbo1 Sym2 Sym3 Sym4 .........
-- ------ ---- ---- ----
.. ->J ->o ->h ->n
.. ->M ->a ->r ->y
.. ->8 ->0

T_Symbol (This "table" pre-exists in XDb1 and is populated)
ID Sym
-- ---
.. 0
.. ...
.. 9
.. a
.. ...
.. z
.. A
.. ...
.. Z

Data in XDb1 is normalized down to atomic symbols.

 > I'm not sure if I understand all requirements. You demand you have to
 > start from normalized data, but you fail to specify what normal form you
 > want: 1NF, 2NF, 3NF, etc.

Replace duplicate things with reference to the orginal using whatever
mechanism your environment provides (keys).
 >> Stay informed about: Nearest Common Ancestor Report (XDb1's $1000 Challenge) 
Back to top
Login to vote
Neo

External


Since: Nov 30, 2003
Posts: 127



(Msg. 12) Posted: Tue May 18, 2004 2:08 am
Post subject: Re: Nearest Common Ancestor Report (XDb1's $1000 Challenge) [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

 > There's one difference in the output: the "things" fido and
 > laptop1 have two common ancestors at the same distance (john and mary),
 > but Xdb1 is only able to find john. I consider this to be a bug in Xdb1's
 > output, since john is definitely NOT a "nearer" common ancestor that mary.

XDb1's code is designed to print any one of the nearest common
ancestor. If your method prints all the nearest common ancestor, that
is fine, if not better.
 >> Stay informed about: Nearest Common Ancestor Report (XDb1's $1000 Challenge) 
Back to top
Login to vote
Neo

External


Since: Nov 30, 2003
Posts: 127



(Msg. 13) Posted: Tue May 18, 2004 3:23 am
Post subject: Re: Nearest Common Ancestor Report (XDb1's $1000 Challenge) [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

 > Note: the output shows the same starting and ending date/time. This
 > indicates an elapsed time of less than 3 milliseconds (the smallest unit
 > of time CURRENT_TIMESTAMP can measure). System used: PC running Windows
 > 2000 + MS SQL Server 2000, AMD Athlon 1.3GHz, 256MB RAM.

Among other things, the difference in normalization between the
implementations (still looking it over) is quite different. Based on a
cursory look, it seems your implementation is simply utilizing the
hierarchy table and not having to resolve any keys to generate the
report (which was not the desired intent). In an initial attempt to
make them more comparable, I modified XDb1 algorithm so as to not
resolve things' names and simply prints their IDs. Based on a few
runs, the report generation time is 2 or 3 ms using the hi-resolution
QueryPerformanceCounter function provided by NT on a 500 Mhz PC. In
general, the execution is linear to PC's speed thus 2 or 3 x
(500/1,300) should be approx 0.77 to 1.2 ms on a 1.3Gz PC. These
numbers are too small to make reasonable comparisons.
 >> Stay informed about: Nearest Common Ancestor Report (XDb1's $1000 Challenge) 
Back to top
Login to vote
Hugo Kornelis

External


Since: May 14, 2004
Posts: 700



(Msg. 14) Posted: Tue May 18, 2004 2:27 pm
Post subject: Re: Nearest Common Ancestor Report (XDb1's $1000 Challenge) [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

On 17 May 2004 17:52:49 -0700, Neo wrote:

  >> Furthermore, you seem to desire the possibility to enter untyped data,
  >> which is of course impossible in a strong-typed language. I do present
  >> "sort of" a way to do this in a relational database, but I'd never use a
  >> kludge even remotely like this for real. Just as I consider XDb1 to be
  >> completely worthless for any real problem, for exactly this same reason.
  >> Remove types, and nothing prevents your user from entering "banana" as
  >> John's age.
 >
 >All things in XDb1 are typed/classified.
 >In XDb1, thing is the most general class.
 >Person's class is thing (person isa thing).
 >John's class is person (john isa person).
 >Mary's class is person (mary isa person).
 >Color's class is thing (color isa thing).
 >Red's class is color (red isa color).
 >Dog's class is thing (dog isa thing).
 >Fido's class is dog (fido isa dog). Etc...
 >Except for thing (which is the root),
 >can you name any thing in an XDb1 database that isn't classified?

Hi Neo,

Probably not. I expect you to know XDb1 lots better than I do, so I'll
take your word for it. But I didn't use the word "classified", I used the
word "untyped". You changed that to "typed/classified" at the beginning of
your reply, then conveniently stripped off the "typed" part in the
rhetorical question at the end of this quote.

Your explanation does raise another question. It looks as if the same
syntax is used to specify both intension and extension of the model,
thereby eliminating the distinction between schema and population. My
guess is that XDb1 would accept the following without complaining:

person isa thing.
john isa person.
mary isa person.
neo isa john.

XDb1 will probably gladly store it - but what does it mean?


 >In RM, classification can be accomplished similarly. By adding a row
 >in a table named T_Person, one effectively classifies that row as a
 >person. I think you may be confusing or limiting typing to hardware
 >types (ie bit, byte and integer, etc). XDb1's data model doesn't
 >require hardware to have bit, byte or integer and is implemented as
 >such.

No, I'm not confusing or limiting anything. I didn't mean hardware types
(you should know that the RM is hardware independent). I did mean
datatypes. There might be situations where an untyped database can have
it's use (I admit that my original choice of words was too harsh), but
outside of those niches, the schema should be strictly seperate from the
data and the datatype of the data should be known.


 >Any thing in XDb1 can have multiple classifications. For example, we
 >can further classify John as a doctor in addition to being a person
 >(john isa doctor). Also if user provides 35, XDb1/application can
 >classify it as both an integer and age. If user provides 35.1,
 >XDb1/application can classify it as both a decimal and age. If user
 >provides 35 & 1/3, XDb1/application can classify it as both a fraction
 >and age. If user provides thirty-five, XDb1/application can classify
 >it as both a word and age. If user provides "over-the-hill",
 >XDb1/application can classify it as both an expression and age.

And this is exactly the reason why I'd never use XDb1 for serious work,
unless I encounter a problem area where the advantages of allowing untyped
data outweigh the disadvantages.

In 99.9% of all applications that store a person's age, comparisons have
to be made: who is younger than 45 years? Who is older, John, Mary or
Fido? How can XDb1 (or any other type-less database) anser the last
question is the user has provided the following input:

over-the-hill isa age.
very-young isa age.
7 isa age.
john is over-the-hill.
mary is very-young.
fido is 7.

Most computers I have used will classify very-young as greater than
over-the-hill and will refuse to compare either of these to 7.

Like I said - there may be specific situations where a product such as
XDb1 has it's use. But it's not (to quote the web site) "the future of
databases" - not even remotely!


 >You are correct in that XDb1 does not automatically validate "basic"
 >classes such as bit, byte and integer. In XDb1, bits, bytes, integer
 >are currently classifications whose rules need to be implemented by
 >the user. In the future they (along with other common classifications
 >such as color and person) might be provided.
 >
 >Suppose in the future, user creates type X. X has nothing to do with
 >hardware bits, bytes or integers and is not built into the db. What
 >will validate that x1 is a X in RM? It will be application logic just
 >as it is in XDb1 (even for bit, byte and integer since XDb1 doesn't
 >require them).

(from another message)
 >Now consider the type/class color. It is not a built in type. How can
 >RM implementations prevent the user from entering "banana" (the fruit)
 >for color?

My comment is not about hardware types. Nor is it about domain checking
(which is what your "color banana" example illustrates). It is about data
types. Defining a column Color as character [varying] does not ensure that
all data entered will be colors, but it does ensure that they are in the
character domain. Same as a definition of the Age column as integer
ensures that all values will be in the numerical domain, even though it is
still possible (if I don't take steps to prevent it) to enter -41 or
3,765,987 in the Age column.

Restricting the data type is not enough to ensure that values adhere to
the required domain. But it does ensure that I can perform operations on
the data and rely on the outcome. Expressions like ValueA > ValueB, ValueA
+ ValueB and ValueA - ValueB are defined for values in the numeric domain.
The first two are also defined for values in the character domain, though
the definition differs from the numeric version; the third expression type
is not defined for the character domain and would result in an error. I
know all this, and I can rely on it - if and only if the database ensures
that only numeric data will be accepted in the Age column and only
character data in the Color column.


(from yet another message)
 >The above statements create the RM equivalent of the following:
 >(Note: ->XYZ represents appropriate ID)
(snip)
 >T_Name (This "table" pre exists in XDb1, but is not pre-populated)
 >ID Symbo1 Sym2 Sym3 Sym4 .........
 >-- ------ ---- ---- ----
 >. ->J ->o ->h ->n
 >. ->M ->a ->r ->y
 >. ->8 ->0

I would not call this "table" relational. It violates 1NF. If I did have
the desire (which I don't have) to break down names and values into
individual letters and digits, I'd at least use a design without repeating
groups.

Groetjes, Hugo
--

Sorry, vandaag geen grappige sig lines meer.
 >> Stay informed about: Nearest Common Ancestor Report (XDb1's $1000 Challenge) 
Back to top
Login to vote
Hugo Kornelis

External


Since: May 14, 2004
Posts: 700



(Msg. 15) Posted: Tue May 18, 2004 2:50 pm
Post subject: Re: Nearest Common Ancestor Report (XDb1's $1000 Challenge) [Login to view extended thread Info.]
Archived from groups: per prev. post (more info?)

On 18 May 2004 00:23:06 -0700, Neo wrote:

Hi Neo,

  >> Note: the output shows the same starting and ending date/time. This
  >> indicates an elapsed time of less than 3 milliseconds (the smallest unit
  >> of time CURRENT_TIMESTAMP can measure). System used: PC running Windows
  >> 2000 + MS SQL Server 2000, AMD Athlon 1.3GHz, 256MB RAM.
 >
 >Among other things, the difference in normalization between the
 >implementations (still looking it over) is quite different.

Of course it is. My implementation is in the Relational Model (or at least
as close to the RM as SQL Server gets). Your implementation is in XDb1,
which is not relational at all. Implementations of this same problem in a
multi-valued database, a hierarchical database and a network database
would also use a different structure, suited to the needs and
possibilities of those databases.

I don't demand that XDb1 should use a relational structure, you should not
demand that a RM solution uses XDb1's structure. If you want to compare,
do so at the level where it counts. Check that both solutions accept the
same input (they do) and check that both solutions produce the correct
report (they do, except that XDb1 prints ony one, randomly chosen common
ancestor if several ancestors are at the same distance - but I won't hold
that against you).

 > Based on a
 >cursory look, it seems your implementation is simply utilizing the
 >hierarchy table and not having to resolve any keys to generate the
 >report (which was not the desired intent).

See above. It should not matter HOW my implementation generates the
report.

 > In an initial attempt to
 >make them more comparable, I modified XDb1 algorithm so as to not
 >resolve things' names and simply prints their IDs.

Now you're changing the rules after the competition has started. Not a
sign of good sportsmanship, Neo!

But if it is now permitted to produce output with only meaningless IDs
instead of a human readable output, I can optimise my design as well. If I
add an integer ID column to the things table and change all referencing
columns in other tables to use that ID instead of the currect character
key, my solution's performance will increase significantly. Integer
comparison is faster than string comparison, the row size will be smaller,
resulting in more rows per data page = less I/O, etc.

 > Based on a few
 >runs, the report generation time is 2 or 3 ms using the hi-resolution
 >QueryPerformanceCounter function provided by NT on a 500 Mhz PC. In
 >general, the execution is linear to PC's speed thus 2 or 3 x
 >(500/1,300) should be approx 0.77 to 1.2 ms on a 1.3Gz PC. These
 >numbers are too small to make reasonable comparisons.

Yes, I remember making a similar remark myself. But it was not me who set
this challenge, that was you. Smile

So far, you failed to disprove my solution. Instead, you attempt to change
the rules and you claim that the size of the sample data (provided by
yourself!) is too small to make reasonable comparisons. Why do I get the
impression that you're trying to bail out of this?


Groetjes, Hugo
--

Sorry, vandaag geen grappige sig lines meer.
 >> Stay informed about: Nearest Common Ancestor Report (XDb1's $1000 Challenge) 
Back to top
Login to vote
Display posts from previous:   
Related Topics:
developing a Object Oriented Database -

Aggregation in FastObjects. How to define that in opt file? - Hello, I have theses java classes: - Course <<key>> code : short name : string gratingCurricular : set<Period> - Period <<key>> numPeriod : short anything: set<someThing> * For eac...

Help : How to manage stars in database ? - All, Any help would be gratefully appreciated. I have this trouble : In database in one column I have record like that : 000000001 0*6666**66 005555555 Stars could be anything between '0' and '9' So, if I look for this id : 0166662066, it would be..

How to get return type (Resultset) information from databa.. - Hai, For Example Below are two storeprocedures one returns resultset and other not, how get this information through programmatically. 1)CREATE PROCEDURE dbo.UpdateCustomers ( @CustomerID nchar(5), @CompanyName nvarchar(40), @ContactName..

Problems with sorting datas with db4o using orderderAscend.. - Hi, I have created a class "result" with several attributes. Then I have inserted approximately 10000 object result, stored in a yap file. I can get the data and display them. I'd like to know if it is possible, using the functions of db4o, t...
   Database Forums (Home) -> Object-Oriented All times are: Pacific Time (US & Canada)
Goto page 1, 2, 3
Page 1 of 3

 
You can post new topics in this forum
You can reply to topics in this forum
You can edit your posts in this forum
You can delete your posts in this forum
You can vote in polls in this forum



[ Contact us | Terms of Service/Privacy Policy ]