        </Field>
      </Fields>
    </Entity>
    <Entity name="Coupling" keyType="medium-string">
      <Notes>A coupling is a relationship between two features. The features are
      physically close on the contig, and there is evidence that they generally
      belong together. The key of this entity is formed by joining the two coupled
      feature IDs with a space.</Notes>
      <Fields>
        <Field name="score" type="int">
          <Notes>A number based on the set of PCHs (pairs of close homologs). A PCH
          indicates that two genes near each other on one genome are very similar to
          genes near each other on another genome. The score counts only PCHs for which
          the genomes are very different. (In other words, we have a pairing that persists
          between different organisms.) A higher score indicates stronger evidence that
          the clustering is meaningful.</Notes>
        </Field>
      </Fields>
    </Entity>
    <Entity name="PCH" keyType="string">
      <Notes>A PCH (physically close homolog) connects a clustering (which is a
      pair of physically close features on a contig) to a second pair of physically
      close features that are similar to the first. Essentially, the PCH is a
      relationship between two clusterings in which the first clustering's features
      are similar to the second clustering's features. The simplest model would
      relate clusterings to each other directly; however, not all
      physically close pairs qualify as clusterings, so we relate a clustering to
      a pair of features. The key is the clustering key followed by the IDs
      of the features in the second pair.</Notes>
      <Fields>
        <Field name="used" type="boolean">
          <Notes>TRUE if this PCH is used in scoring the attached clustering,
          else FALSE. If a clustering has a PCH for a particular genome and many
          similar genomes are present, then a PCH will probably exist for the
          similar genomes as well. When this happens, only one of the PCHs is
          scored; the others are considered duplicates of the same evidence.</Notes>
        </Field>
      </Fields>
    </Entity>
  </Entities>
  <Relationships>
    <Relationship name="ParticipatesInCoupling" from="Feature" to="Coupling" arity="MM">
      <Notes>This relationship connects a feature to all the functional couplings
      in which it participates. A functional coupling records that two features
      are close to each other on a chromosome and that similar
      features in other genomes also tend to be close.</Notes>
      <Fields>
        <Field name="pos" type="int">
          <Notes>Ordinal position of the feature in the coupling. Currently,
          this is either "1" or "2".</Notes>
        </Field>
      </Fields>
      <ToIndex>
        <Notes>This index enables the application to view the features of
        a coupling in the proper order. The order influences the way the
        PCHs are examined.</Notes>
        <IndexFields>
          <IndexField name="pos" order="ascending" />
        </IndexFields>
      </ToIndex>
    </Relationship>
    <Relationship name="IsEvidencedBy" from="Coupling" to="PCH" arity="1M">
      <Notes>This relationship connects a functional coupling to the physically
      close homologs (PCHs) that affirm that the coupling is meaningful.</Notes>
    </Relationship>
    <Relationship name="UsesAsEvidence" from="PCH" to="Feature" arity="MM">
      <Notes>This relationship connects a PCH to the features that represent its
      evidence. Each PCH is connected to a parent coupling that relates two features
      on a specific genome. The PCH's evidence that the parent coupling is functional
      is the existence of two physically close features on a different genome that
      correspond to the features in the coupling. Those features are found on the
      far side of this relationship.</Notes>
      <Fields>
        <Field name="pos" type="int">
          <Notes>Ordinal position of the feature in the coupling that corresponds
          to our target feature. There is a one-to-one correspondence between the
          features connected to the PCH by this relationship and the features
          connected to the PCH's parent coupling. The ordinal position is used
          to decode that relationship. Currently, this field is either "1" or
          "2".</Notes>
        </Field>
        <Field name="percentMatch" type="float">
          <Notes>Percent similarity between our target feature and the
          corresponding feature of the coupling.</Notes>
        </Field>
        <Field name="paralogs" type="int">
          <Notes>Number of paralogs of our target feature, that is, similar
          features on the same genome. A higher paralog count indicates less
          valuable evidence.</Notes>
        </Field>
      </Fields>
      <FromIndex>
        <Notes>This index enables the application to view the features of
        a PCH in the proper order.</Notes>
        <IndexFields>
          <IndexField name="pos" order="ascending" />
        </IndexFields>
      </FromIndex>
    </Relationship>
    <Relationship name="HasContig" from="Genome" to="Contig" arity="1M">
      <Notes>This relationship connects a genome to the contigs that contain the actual genetic
      information.</Notes>
        </IndexFields>
      </ToIndex>
    </Relationship>
    <Relationship name="IsClusteredOnChromosomeWith" from="Feature" to="Feature" arity="MM">
      <Notes>This relationship is one of two that relate features to each other. It connects
      features that are physically close to each other on a single chromosome.</Notes>
      <Fields>
        <Field name="score" type="int">
          <Notes>The number of co-occurrences in genomes that are not
          extremely closely related.</Notes>
        </Field>
      </Fields>
    </Relationship>
    <Relationship name="IsBidirectionalBestHitOf" from="Feature" to="Feature" arity="MM">
      <Notes>This relationship is one of two that relate features to each other. It
      connects features that are very similar but on separate genomes. A