<?xml version="1.0" encoding="ISO-8859-1"?>

<rdf:RDF
 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 xmlns="http://purl.org/rss/1.0/"
 xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/"
 xmlns:dc="http://purl.org/dc/elements/1.1/"
 xmlns:syn="http://purl.org/rss/1.0/modules/syndication/"
 xmlns:prism="http://purl.org/rss/1.0/modules/prism/"
 xmlns:admin="http://webns.net/mvcb/"
>

<channel rdf:about="http://biostatistics.oxfordjournals.org">
<title>Biostatistics - current issue</title>
<link>http://biostatistics.oxfordjournals.org</link>
<description>Biostatistics - RSS feed of current issue</description>
<prism:eIssn>1468-4357</prism:eIssn>
<prism:coverDisplayDate>July 2008</prism:coverDisplayDate>
<prism:publicationName>Biostatistics</prism:publicationName>
<prism:issn>1465-4644</prism:issn>
<items>
 <rdf:Seq>
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/391?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/400?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/411?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/419?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/432?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/442?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/458?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/467?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/484?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/501?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/513?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/523?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/540?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/555?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/566?rss=1" />
  <rdf:li rdf:resource="http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/577?rss=1" />
 </rdf:Seq>
</items>
</channel>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/391?rss=1">
<title><![CDATA[Genetic model selection in two-phase analysis for case-control association studies]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/391?rss=1</link>
<description><![CDATA[
<p>The Cochran&ndash;Armitage trend test (CATT) is well suited for testing association between a marker and a disease in case&ndash;control studies. When the underlying genetic model for the disease is known, the CATT optimal for the genetic model is used. For complex diseases, however, the genetic models of the true disease loci are unknown. In this situation, robust tests are preferable. We propose a two-phase analysis with model selection for the case&ndash;control design. In the first phase, we use the difference of Hardy&ndash;Weinberg disequilibrium coefficients between the cases and the controls for model selection. Then, an optimal CATT corresponding to the selected model is used for testing association. The correlation of the statistics used for selection and the test for association is derived to adjust the two-phase analysis with control of the Type-I error rate. The simulation studies show that this new approach has greater efficiency robustness than the existing methods.</p>
]]></description>
<dc:creator><![CDATA[Zheng, G., Ng, H. K. T.]]></dc:creator>
<dc:date>2008-06-18</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxm039</dc:identifier>
<dc:title><![CDATA[Genetic model selection in two-phase analysis for case-control association studies]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:number>3</prism:number>
<prism:volume>9</prism:volume>
<prism:endingPage>399</prism:endingPage>
<prism:publicationDate>2008-07-01</prism:publicationDate>
<prism:startingPage>391</prism:startingPage>
<prism:section>Articles</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/400?rss=1">
<title><![CDATA[The separation of timescales in Bayesian survival modeling of the time-varying effect of a time-dependent exposure]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/400?rss=1</link>
<description><![CDATA[
<p>In this paper, we apply flexible Bayesian survival analysis methods to investigate the risk of lymphoma associated with kidney transplantation among patients with end-stage renal disease. Of key interest is the potentially time-varying effect of a time-dependent exposure: transplant status. Bayesian modeling of the baseline hazard and the effect of transplant requires consideration of 2 timescales: time since study start and time since transplantation, respectively. Previous related work has not dealt with the separation of multiple timescales. Using a hierarchical model for the hazard function, both timescales are incorporated via conditionally independent stochastic processes; smoothing of each process is specified via intrinsic conditional Gaussian autoregressions. Features of the corresponding posterior distribution are evaluated from draws obtained via a Metropolis&ndash;Hastings&ndash;Green algorithm.</p>
]]></description>
<dc:creator><![CDATA[Haneuse, S. J.-P. A., Rudser, K. D., Gillen, D. L.]]></dc:creator>
<dc:date>2008-06-18</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxm038</dc:identifier>
<dc:title><![CDATA[The separation of timescales in Bayesian survival modeling of the time-varying effect of a time-dependent exposure]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:number>3</prism:number>
<prism:volume>9</prism:volume>
<prism:endingPage>410</prism:endingPage>
<prism:publicationDate>2008-07-01</prism:publicationDate>
<prism:startingPage>400</prism:startingPage>
<prism:section>Articles</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/411?rss=1">
<title><![CDATA[MOST: detecting cancer differential gene expression]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/411?rss=1</link>
<description><![CDATA[
<p>We propose a new statistic for the detection of differentially expressed genes when the genes are activated only in a subset of the samples. Statistics designed for this unconventional circumstance have proved to be valuable for most cancer studies, where oncogenes are activated for a small number of disease samples. Previous efforts made in this direction include cancer outlier profile analysis (Tomlins <I>and others</I>, 2005), outlier sum (Tibshirani and Hastie, 2007), and outlier robust <I>t</I>-statistics (Wu, 2007). We propose a new statistic called maximum ordered subset <I>t</I>-statistics (MOST), which seems to be natural when the number of activated samples is unknown. We compare MOST to other statistics and find that the proposed method often has more power than its competitors.</p>
]]></description>
<dc:creator><![CDATA[Lian, H.]]></dc:creator>
<dc:date>2008-06-18</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxm042</dc:identifier>
<dc:title><![CDATA[MOST: detecting cancer differential gene expression]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:number>3</prism:number>
<prism:volume>9</prism:volume>
<prism:endingPage>418</prism:endingPage>
<prism:publicationDate>2008-07-01</prism:publicationDate>
<prism:startingPage>411</prism:startingPage>
<prism:section>Articles</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/419?rss=1">
<title><![CDATA[Predicting renal graft failure using multivariate longitudinal profiles]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/419?rss=1</link>
<description><![CDATA[
<p>Patients who have undergone renal transplantation are monitored longitudinally at irregular time intervals over 10 years or more. This yields a set of biochemical and physiological markers containing valuable information to anticipate a failure of the graft. A general linear, generalized linear, or nonlinear mixed model is used to describe the longitudinal profile of each marker. To account for the correlation between markers, the univariate mixed models are combined into a multivariate mixed model (MMM) by specifying a joint distribution for the random effects. Due to the high number of markers, a pairwise modeling strategy, where all possible pairs of bivariate mixed models are fitted, is used to obtain parameter estimates for the MMM. These estimates are used in a Bayes rule to obtain, at each point in time, the prognosis for long-term success of the transplant. It is shown that allowing the markers to be correlated can improve this prognosis.</p>
]]></description>
<dc:creator><![CDATA[Fieuws, S., Verbeke, G., Maes, B., Vanrenterghem, Y.]]></dc:creator>
<dc:date>2008-06-18</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxm041</dc:identifier>
<dc:title><![CDATA[Predicting renal graft failure using multivariate longitudinal profiles]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:number>3</prism:number>
<prism:volume>9</prism:volume>
<prism:endingPage>431</prism:endingPage>
<prism:publicationDate>2008-07-01</prism:publicationDate>
<prism:startingPage>419</prism:startingPage>
<prism:section>Articles</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/432?rss=1">
<title><![CDATA[Sparse inverse covariance estimation with the graphical lasso]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/432?rss=1</link>
<description><![CDATA[
<p>We consider the problem of estimating sparse graphs by a lasso penalty applied to the inverse covariance matrix. Using a coordinate descent procedure for the lasso, we develop a simple algorithm&mdash;the <I>graphical lasso</I>&mdash;that is remarkably fast: It solves a 1000-node problem (~500000 parameters) in at most a minute and is 30&ndash;4000 times faster than competing methods. It also provides a conceptual link between the exact problem and the approximation suggested by Meinshausen and B&uuml;hlmann (2006). We illustrate the method on some cell-signaling data from proteomics.</p>
]]></description>
<dc:creator><![CDATA[Friedman, J., Hastie, T., Tibshirani, R.]]></dc:creator>
<dc:date>2008-06-18</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxm045</dc:identifier>
<dc:title><![CDATA[Sparse inverse covariance estimation with the graphical lasso]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:number>3</prism:number>
<prism:volume>9</prism:volume>
<prism:endingPage>441</prism:endingPage>
<prism:publicationDate>2008-07-01</prism:publicationDate>
<prism:startingPage>432</prism:startingPage>
<prism:section>Articles</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/442?rss=1">
<title><![CDATA[Monitoring late-onset toxicities in phase I trials using predicted risks]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/442?rss=1</link>
<description><![CDATA[
<p>Late-onset (LO) toxicities are a serious concern in many phase I trials. Since most dose-limiting toxicities occur soon after therapy begins, most dose-finding methods use a binary indicator of toxicity occurring within a short initial time period. If an agent causes LO toxicities, however, an undesirably large number of patients may be treated at toxic doses before any toxicities are observed. A method addressing this problem is the time-to-event continual reassessment method (TITE-CRM, Cheung and Chappell, 2000). We propose a Bayesian dose-finding method similar to the TITE-CRM in which doses are chosen using time-to-toxicity data. The new aspect of our method is a set of rules, based on predictive probabilities, that temporarily suspend accrual if the risk of toxicity at prospective doses for future patients is unacceptably high. If additional follow-up data reduce the predicted risk of toxicity to an acceptable level, then accrual is restarted, and this process may be repeated several times during the trial. A simulation study shows that the proposed method provides a greater degree of safety than the TITE-CRM, while still reliably choosing the preferred dose. This advantage increases with accrual rate, but the price of this additional safety is that the trial takes longer to complete on average.</p>
]]></description>
<dc:creator><![CDATA[Bekele, B. N., Ji, Y., Shen, Y., Thall, P. F.]]></dc:creator>
<dc:date>2008-06-18</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxm044</dc:identifier>
<dc:title><![CDATA[Monitoring late-onset toxicities in phase I trials using predicted risks]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:number>3</prism:number>
<prism:volume>9</prism:volume>
<prism:endingPage>457</prism:endingPage>
<prism:publicationDate>2008-07-01</prism:publicationDate>
<prism:startingPage>442</prism:startingPage>
<prism:section>Articles</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/458?rss=1">
<title><![CDATA[Significance levels for studies with correlated test statistics]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/458?rss=1</link>
<description><![CDATA[
<p>When testing large numbers of null hypotheses, one needs to assess the evidence against the global null hypothesis that none of the hypotheses is false. Such evidence typically is based on the test statistic of the largest magnitude, whose statistical significance is evaluated by permuting the sample units to simulate its null distribution. Efron (2007) has noted that correlation among the test statistics can induce substantial interstudy variation in the shapes of their histograms, which may cause misleading tail counts. Here, we show that permutation-based estimates of the overall significance level also can be misleading when the test statistics are correlated. We propose that such estimates be conditioned on a simple measure of the spread of the observed histogram, and we provide a method for obtaining conditional significance levels. We justify this conditioning using the conditionality principle described by Cox and Hinkley (1974). Application of the method to gene expression data illustrates the circumstances when conditional significance levels are needed.</p>
]]></description>
<dc:creator><![CDATA[Shi, J., Levinson, D. F., Whittemore, A. S.]]></dc:creator>
<dc:date>2008-06-18</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxm047</dc:identifier>
<dc:title><![CDATA[Significance levels for studies with correlated test statistics]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:number>3</prism:number>
<prism:volume>9</prism:volume>
<prism:endingPage>466</prism:endingPage>
<prism:publicationDate>2008-07-01</prism:publicationDate>
<prism:startingPage>458</prism:startingPage>
<prism:section>Articles</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/467?rss=1">
<title><![CDATA[Complementary hierarchical clustering]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/467?rss=1</link>
<description><![CDATA[
<p>When applying hierarchical clustering algorithms to cluster patient samples from microarray data, the clustering patterns generated by most algorithms tend to be dominated by groups of highly differentially expressed genes that have closely related expression patterns. Sometimes, these genes may not be relevant to the biological process under study or their functions may already be known. The problem is that these genes can potentially drown out the effects of other genes that are relevant or have novel functions. We propose a procedure called complementary hierarchical clustering that is designed to uncover the structures arising from these novel genes that are not as highly expressed. Simulation studies show that the procedure is effective when applied to a variety of examples. We also define a concept called relative gene importance that can be used to identify the influential genes in a given clustering. Finally, we analyze a microarray data set from 295 breast cancer patients, using clustering with the correlation-based distance measure. The complementary clustering reveals a grouping of the patients which is uncorrelated with a number of known prognostic signatures and significantly differing distant metastasis-free probabilities.</p>
]]></description>
<dc:creator><![CDATA[Nowak, G., Tibshirani, R.]]></dc:creator>
<dc:date>2008-06-18</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxm046</dc:identifier>
<dc:title><![CDATA[Complementary hierarchical clustering]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:number>3</prism:number>
<prism:volume>9</prism:volume>
<prism:endingPage>483</prism:endingPage>
<prism:publicationDate>2008-07-01</prism:publicationDate>
<prism:startingPage>467</prism:startingPage>
<prism:section>Articles</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/484?rss=1">
<title><![CDATA[Weighted clustering of called array CGH data]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/484?rss=1</link>
<description><![CDATA[
<p>Array comparative genomic hybridization (aCGH) is a laboratory technique to measure chromosomal copy number changes. A clear biological interpretation of the measurements is obtained by mapping these onto an ordinal scale with categories loss/normal/gain of a copy. The pattern of gains and losses harbors a level of tumor specificity. Here, we present WECCA (weighted clustering of called aCGH data), a method for weighted clustering of samples on the basis of the ordinal aCGH data. Two similarities to be used in the clustering and particularly suited for ordinal data are proposed, which are generalized to deal with weighted observations. In addition, a new form of linkage, especially suited for ordinal data, is introduced. In a simulation study, we show that the proposed cluster method is competitive with clustering using the continuous data. We illustrate WECCA using an application to a breast cancer data set, where WECCA finds a clustering that relates better to survival than the original one.</p>
]]></description>
<dc:creator><![CDATA[Van Wieringen, W. N., Van De Wiel, M. A., Ylstra, B.]]></dc:creator>
<dc:date>2008-06-18</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxm048</dc:identifier>
<dc:title><![CDATA[Weighted clustering of called array CGH data]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:number>3</prism:number>
<prism:volume>9</prism:volume>
<prism:endingPage>500</prism:endingPage>
<prism:publicationDate>2008-07-01</prism:publicationDate>
<prism:startingPage>484</prism:startingPage>
<prism:section>Articles</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/501?rss=1">
<title><![CDATA[A simulation-based marginal method for longitudinal data with dropout and mismeasured covariates]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/501?rss=1</link>
<description><![CDATA[
<p>Longitudinal data often contain missing observations and error-prone covariates. Extensive attention has been directed to analysis methods to adjust for the bias induced by missing observations. There is relatively little work on investigating the effects of covariate measurement error on estimation of the response parameters, especially on simultaneously accounting for the biases induced by both missing values and mismeasured covariates. It is not clear what the impact of ignoring measurement error is when analyzing longitudinal data with both missing observations and error-prone covariates. In this article, we study the effects of covariate measurement error on estimation of the response parameters for longitudinal studies. We develop an inference method that adjusts for the biases induced by measurement error as well as by missingness. The proposed method does not require the full specification of the distribution of the response vector but only requires modeling its mean and variance structures. Furthermore, the proposed method employs the so-called functional modeling strategy to handle the covariate process, with the distribution of covariates left unspecified. These features, plus the simplicity of implementation, make the proposed method very attractive. In this paper, we establish the asymptotic properties for the resulting estimators. With the proposed method, we conduct sensitivity analyses on a cohort data set arising from the Framingham Heart Study. Simulation studies are carried out to evaluate the impact of ignoring covariate measurement error and to assess the performance of the proposed method.</p>
]]></description>
<dc:creator><![CDATA[Yi, G. Y.]]></dc:creator>
<dc:date>2008-06-18</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxm054</dc:identifier>
<dc:title><![CDATA[A simulation-based marginal method for longitudinal data with dropout and mismeasured covariates]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:number>3</prism:number>
<prism:volume>9</prism:volume>
<prism:endingPage>512</prism:endingPage>
<prism:publicationDate>2008-07-01</prism:publicationDate>
<prism:startingPage>501</prism:startingPage>
<prism:section>Articles</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/513?rss=1">
<title><![CDATA[Statistical models for quantifying diagnostic accuracy with multiple lesions per patient]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/513?rss=1</link>
<description><![CDATA[
<p>We propose random-effects models to summarize and quantify the accuracy of the diagnosis of multiple lesions on a single image without assuming independence between lesions. The number of false-positive lesions was assumed to be distributed as a Poisson mixture, and the proportion of true-positive lesions was assumed to be distributed as a binomial mixture. We considered univariate and bivariate, both parametric and nonparametric mixture models. We applied our tools to simulated data and data of a study assessing diagnostic accuracy of virtual colonography with computed tomography in 200 patients suspected of having one or more polyps.</p>
]]></description>
<dc:creator><![CDATA[Zwinderman, A. H., Glas, A. S., Bossuyt, P. M., Florie, J., Bipat, S., Stoker, J.]]></dc:creator>
<dc:date>2008-06-18</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxm052</dc:identifier>
<dc:title><![CDATA[Statistical models for quantifying diagnostic accuracy with multiple lesions per patient]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:number>3</prism:number>
<prism:volume>9</prism:volume>
<prism:endingPage>522</prism:endingPage>
<prism:publicationDate>2008-07-01</prism:publicationDate>
<prism:startingPage>513</prism:startingPage>
<prism:section>Articles</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/523?rss=1">
<title><![CDATA[Penalized loss functions for Bayesian model comparison]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/523?rss=1</link>
<description><![CDATA[
<p>The deviance information criterion (DIC) is widely used for Bayesian model comparison, despite the lack of a clear theoretical foundation. DIC is shown to be an approximation to a penalized loss function based on the deviance, with a penalty derived from a cross-validation argument. This approximation is valid only when the effective number of parameters in the model is much smaller than the number of independent observations. In disease mapping, a typical application of DIC, this assumption does not hold and DIC under-penalizes more complex models. Another deviance-based loss function, derived from the same decision-theoretic framework, is applied to mixture models, which have previously been considered an unsuitable application for DIC.</p>
]]></description>
<dc:creator><![CDATA[Plummer, M.]]></dc:creator>
<dc:date>2008-06-18</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxm049</dc:identifier>
<dc:title><![CDATA[Penalized loss functions for Bayesian model comparison]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:number>3</prism:number>
<prism:volume>9</prism:volume>
<prism:endingPage>539</prism:endingPage>
<prism:publicationDate>2008-07-01</prism:publicationDate>
<prism:startingPage>523</prism:startingPage>
<prism:section>Articles</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/540?rss=1">
<title><![CDATA[Mixture models with multiple levels, with application to the analysis of multifactor gene expression data]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/540?rss=1</link>
<description><![CDATA[
<p>Model-based clustering is a popular tool for summarizing high-dimensional data. With the number of high-throughput large-scale gene expression studies still on the rise, the need for effective data-summarizing tools has never been greater. By grouping genes according to a common experimental expression profile, we may gain new insight into the biological pathways that steer biological processes of interest. Clustering of gene profiles can also assist in assigning functions to genes that have not yet been functionally annotated. In this paper, we propose 2 model selection procedures for model-based clustering. Model selection in model-based clustering has to date focused on the identification of data dimensions that are relevant for clustering. However, in more complex data structures, with multiple experimental factors, such an approach does not provide easily interpreted clustering outcomes. We propose a mixture model with multiple levels that provides sparse representations both "within" and "between" cluster profiles. We explore various flexible "within-cluster" parameterizations and discuss how efficient parameterizations can greatly enhance the objective interpretability of the generated clusters. Moreover, we allow for a sparse "between-cluster" representation with a different number of clusters at different levels of an experimental factor of interest. This enhances interpretability of clusters generated in multiple-factor contexts. Interpretable cluster profiles can assist in detecting biologically relevant groups of genes that may be missed with less efficient parameterizations. We use our multilevel mixture model to mine a proliferating cell line expression data set for annotational context and regulatory motifs. We also investigate the performance of the multilevel clustering approach on several simulated data sets.</p>
]]></description>
<dc:creator><![CDATA[Jornsten, R., Keles, S.]]></dc:creator>
<dc:date>2008-06-18</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxm051</dc:identifier>
<dc:title><![CDATA[Mixture models with multiple levels, with application to the analysis of multifactor gene expression data]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:number>3</prism:number>
<prism:volume>9</prism:volume>
<prism:endingPage>554</prism:endingPage>
<prism:publicationDate>2008-07-01</prism:publicationDate>
<prism:startingPage>540</prism:startingPage>
<prism:section>Articles</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/555?rss=1">
<title><![CDATA[Linear mixed models for longitudinal shape data with applications to facial modeling]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/555?rss=1</link>
<description><![CDATA[
<p>We present a novel application of methods for analysis of high-dimensional longitudinal data to a comparison of facial shape over time between babies with cleft lip and palate and similarly aged controls. A pairwise methodology is used that was introduced in Fieuws and Verbeke (2006) in order to apply a linear mixed-effects model to data of high dimensions, such as those describing facial shape. The approach involves fitting bivariate linear mixed-effects models to all the pairwise combinations of responses, where the latter result from the individual coordinate positions, and aggregating the results across repeated parameter estimates (such as the random-effects variance for a particular coordinate). We describe one example using landmarks and another using facial curves from the cleft lip study, the latter using B-splines to provide an efficient parameterization. The results are presented in 2 dimensions, both in the profile and in the frontal views, with bivariate confidence intervals for the mean position of each landmark or curve, allowing objective assessment of significant differences in particular areas of the face between the 2 groups. Model comparison is performed using Wald and pseudolikelihood ratio tests.</p>
]]></description>
<dc:creator><![CDATA[Barry, S. J. E., Bowman, A. W.]]></dc:creator>
<dc:date>2008-06-18</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxm056</dc:identifier>
<dc:title><![CDATA[Linear mixed models for longitudinal shape data with applications to facial modeling]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:number>3</prism:number>
<prism:volume>9</prism:volume>
<prism:endingPage>565</prism:endingPage>
<prism:publicationDate>2008-07-01</prism:publicationDate>
<prism:startingPage>555</prism:startingPage>
<prism:section>Articles</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/566?rss=1">
<title><![CDATA[ROC analysis with multiple classes and multiple tests: methodology and its application in microarray studies]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/566?rss=1</link>
<description><![CDATA[
<p>The accuracy of a single diagnostic test for binary outcome can be summarized by the area under the receiver operating characteristic (ROC) curve. Volume under the surface and hypervolume under the manifold have been proposed as extensions for multiple class diagnosis (Scurfield, 1996, 1998). However, the lack of simple inferential procedures for such measures has limited their practical utility. Part of the difficulty is that calculating such quantities may not be straightforward, even with a single test. The decision rule used to generate the ROC surface requires class probability assessments, which are not provided by the tests. We develop a method based on estimating the probabilities via some procedure, for example, multinomial logistic regression. Bootstrap inferences are proposed to account for variability in estimating the probabilities and perform well in simulations. The ROC measures are compared to the correct classification rate, which depends heavily on class prevalences. An example of tumor classification with microarray data demonstrates that this property may lead to substantially different analyses. The ROC-based analysis yields notable decreases in model complexity over previous analyses.</p>
]]></description>
<dc:creator><![CDATA[Li, J., Fine, J. P.]]></dc:creator>
<dc:date>2008-06-18</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxm050</dc:identifier>
<dc:title><![CDATA[ROC analysis with multiple classes and multiple tests: methodology and its application in microarray studies]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:number>3</prism:number>
<prism:volume>9</prism:volume>
<prism:endingPage>576</prism:endingPage>
<prism:publicationDate>2008-07-01</prism:publicationDate>
<prism:startingPage>566</prism:startingPage>
<prism:section>Articles</prism:section>
</item>

<item rdf:about="http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/577?rss=1">
<title><![CDATA[Regression models for infant mortality data in Norwegian siblings, using a compound Poisson frailty distribution with random scale]]></title>
<link>http://biostatistics.oxfordjournals.org/cgi/content/short/9/3/577?rss=1</link>
<description><![CDATA[
<p>The power variance function distributions, which include the gamma and compound Poisson (CP) distributions among others, are commonly used in frailty models for family data. In a previous paper, we presented a frailty model constructed by randomizing the scale parameter in a CP distribution. When combined with a parametric baseline hazard, this yields a model with heterogeneity on both the individual and the family level and a subgroup with zero frailty, corresponding to people not experiencing the event. In this paper, we discuss covariates in the model. Depending on where the covariates are inserted in the model, one may have proportional hazards at the individual level, the family level, and a larger group level (for covariates shared by many families, e.g. ethnic groups) or get accelerated failure times. Each of these alternatives gives a specific interpretation of the covariate effects. An application to data on infant mortality in siblings from the Medical Birth Registry of Norway is included. We compare the results for some of the different covariate modeling options.</p>
]]></description>
<dc:creator><![CDATA[Moger, T. A., Aalen, O. O.]]></dc:creator>
<dc:date>2008-06-18</dc:date>
<dc:identifier>info:doi/10.1093/biostatistics/kxn003</dc:identifier>
<dc:title><![CDATA[Regression models for infant mortality data in Norwegian siblings, using a compound Poisson frailty distribution with random scale]]></dc:title>
<dc:publisher>Biometrika Trust</dc:publisher>
<prism:number>3</prism:number>
<prism:volume>9</prism:volume>
<prism:endingPage>591</prism:endingPage>
<prism:publicationDate>2008-07-01</prism:publicationDate>
<prism:startingPage>577</prism:startingPage>
<prism:section>Articles</prism:section>
</item>

</rdf:RDF>